From: Michael McCandless
Date: Fri, 23 Dec 2016 06:21:19 -0500
Subject: Re: Sorting, Range Query, faceting - NumericDocValuesField Vs LongField
To: Lucene Users, Kumaran R

Note that Erick is giving you the Solr syntax below, but if you are
using Lucene directly, that obviously doesn't apply (though the same
general concepts do).

I would strongly recommend not using uninversion: it's an archaic and
costly option that Lucene only offered long ago because it didn't have
doc values, but that changed many years ago now.

Also the new dimensional points (IntPoint, LongPoint) give better
performance than the legacy postings based ("trie") numerics. See
https://www.elastic.co/blog/apache-lucene-numeric-filters for some of
the history here ...

Mike McCandless

http://blog.mikemccandless.com

On Thu, Dec 22, 2016 at 10:37 PM, Erick Erickson wrote:
> bq: Does this mean LongField/IntField just supports lexicographic
> order in sorting?
>
> no on several counts.
>
> No numeric type (long, int, float, double or trie values) support
> lexicographic sorting. That's the whole _point_ of having numeric
> types in the first place. Well, and efficient range queries in the
> Trie variants.
>
> docValues are an additional _attribute_ on the field so it's perfectly
> reasonable to have a long field that's both
> indexed="true" and docValues="true". Or
> indexed="true" and docValues="false". Or
> indexed="false" and docValues="true". Or
> indexed="false" and docValues="false"
>
> Do not think of them as separate field types.
>
> indexed="true" is _required_ for searching. A field with
> indexed="true" and docValues="false" also supports faceting, grouping
> and sorting (numeric).
>
> A field with docValues="true" just supports faceting, grouping and
> sorting without having to "uninvert" the field in the Java heap, the
> data is out in OS cache.
> See Uwe's excellent blog here:
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>
> Best,
> Erick
>
> On Thu, Dec 22, 2016 at 6:57 PM, Kumaran Ramasubramanian wrote:
>> Thank you Adrien.
>>
>> "NumericDocValuesField is the one that supports sorting."
>>
>> Does this mean LongField/IntField just supports lexicographic order in
>> sorting?
>>
>> -
>> Kumaran R
>>
>> On Dec 22, 2016 11:28 PM, "Adrien Grand" wrote:
>>
>> On Thu, Dec 22, 2016 at 18:50, Kumaran Ramasubramanian wrote:
>>
>>> I want to provide sorting, range search and faceting in numeric fields.
>>>
>>> AFAIK, the purposes of the different numeric field types are:
>>>
>>> NumericDocValuesField supports sorting and faceting
>>> LongField/IntField supports range query and sorting
>>>
>>
>> LongField/IntField only support querying; NumericDocValuesField is the
>> one that supports sorting.
>>
>> Also note that as of 6.0 LongField and IntField have been replaced with
>> LongPoint and IntPoint.
>>
>>> 1. Should I duplicate one field in the above-mentioned types to achieve
>>> all three features for numerics?
>>>
>>
>> Yes. By the way, it is perfectly fine to use the same field name for the
>> point field and the doc values field.
>>
>>> 2. If I am ready to sacrifice faceting, is it advisable to use LongField
>>> for sorting and range query?
>>>
>>
>> As said above, you need doc values for sorting.
>>
>>> 3. During sorting, will NumericDocValuesField (column stride storage)
>>> perform better than LongField (trie structure)? If so, should I
>>> duplicate the field in both cases 1 and 2?
>>>
>>
>> Same note here.
>>
>> Adrien
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
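
To make the distinction in this thread concrete, here is a toy model in plain Java. It is not Lucene code and does not reflect Lucene's actual data structures: the class NumericFieldSketch and all of its methods are hypothetical names invented for illustration. It assumes a single long value per document and keeps two views of the same field, a value-ordered structure (standing in for the indexed/points side, used for range queries) and a per-document column (standing in for doc values, used for sorting). The uninvert() method shows roughly why uninversion is costly: it has to walk every indexed value to rebuild the per-document column on the heap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only (hypothetical names): one numeric field stored two ways. */
public class NumericFieldSketch {
    // "Indexed" view: value -> doc ids, kept in value order (good for range queries).
    private final TreeMap<Long, List<Integer>> indexed = new TreeMap<>();
    // "Doc values" view: doc id -> value, a plain column (good for sorting and faceting).
    private final List<Long> column = new ArrayList<>();

    /** Adds a document with the given value to both views and returns its doc id. */
    public int addDoc(long value) {
        int docId = column.size();
        column.add(value);
        indexed.computeIfAbsent(value, v -> new ArrayList<>()).add(docId);
        return docId;
    }

    /** Range query (inclusive bounds), answered from the value-ordered view. */
    public List<Integer> range(long min, long max) {
        List<Integer> hits = new ArrayList<>();
        for (List<Integer> docs : indexed.subMap(min, true, max, true).values()) {
            hits.addAll(docs);
        }
        hits.sort(Comparator.naturalOrder()); // return hits in doc id order
        return hits;
    }

    /** Sorts hits by field value using a direct per-document lookup in the column. */
    public List<Integer> sortByColumn(List<Integer> hits) {
        List<Integer> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingLong(d -> column.get(d)));
        return sorted;
    }

    /** "Uninversion": rebuilds doc id -> value on the heap by visiting every indexed value. */
    public long[] uninvert() {
        long[] byDoc = new long[column.size()];
        for (Map.Entry<Long, List<Integer>> entry : indexed.entrySet()) {
            for (int docId : entry.getValue()) {
                byDoc[docId] = entry.getKey();
            }
        }
        return byDoc;
    }

    public static void main(String[] args) {
        NumericFieldSketch field = new NumericFieldSketch();
        for (long v : new long[] {100L, 9L, 25L, 3L}) {
            field.addDoc(v);
        }
        List<Integer> hits = field.range(5L, 100L);
        System.out.println("hits in doc order:    " + hits);
        System.out.println("hits sorted by value: " + field.sortByColumn(hits));
        System.out.println("uninverted column:    " + Arrays.toString(field.uninvert()));
    }
}
```

In Lucene 6.0 and later, the corresponding step per Adrien's answer is to add both a LongPoint and a NumericDocValuesField under the same field name, then query with LongPoint.newRangeQuery and sort with a SortField of type SortField.Type.LONG.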