From: Kumaran Ramasubramanian
Date: Fri, 23 Dec 2016 17:10:18 +0530
Subject: Re: Sorting, Range Query,
faceting - NumericDocValuesField Vs LongField
To: Lucene Users

Thanks Erick and Mike.

I am using Lucene 4.10.4 directly. I have observed better performance with
LongField than with lexicographic sorting. I understand that this is due to
the trie structure of LongField. But one more doubt: will the uninversion
process happen for IntField / LongField too?

Thanks for the link, Mike. I will look into LongPoint in recent versions.

--
Kumaran R

On Fri, Dec 23, 2016 at 4:51 PM, Michael McCandless <lucene@mikemccandless.com> wrote:

> Note that Erick is giving you the Solr syntax below, but if you are
> using Lucene directly, that obviously doesn't apply (though the same
> general concepts do).
>
> I would strongly recommend not using uninversion: it's an archaic and
> costly option that Lucene only offered long ago because it didn't have
> doc values, but that changed many years ago now.
>
> Also the new dimensional points (IntPoint, LongPoint) give better
> performance than the legacy postings based ("trie") numerics.
>
> See https://www.elastic.co/blog/apache-lucene-numeric-filters for some
> of the history here ...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Thu, Dec 22, 2016 at 10:37 PM, Erick Erickson wrote:
> > bq: Does this mean LongField/IntField just supports lexicographic
> > order in sorting?
> >
> > no on several counts.
> >
> > No numeric type (long, int, float, double or trie values) support
> > lexicographic sorting. That's the whole _point_ of having numeric
> > types in the first place. Well, and efficient range queries in the
> > Trie variants.
> >
> > docValues are an additional _attribute_ on the field so it's perfectly
> > reasonable to have a long field that's both
> > indexed="true" and docValues="true". Or
> > indexed="true" and docValues="false". Or
> > indexed="false" and docValues="true". Or
> > indexed="false" and docValues="false"
> >
> > Do not think of them as separate field types.
> >
> > indexed="true" is _required_ for searching. A field with
> > indexed="true" and docValues="false" also supports faceting, grouping
> > and sorting (numeric).
> >
> > A field with docValues="true" just supports faceting, grouping and
> > sorting without having to "uninvert" the field in the Java heap, the
> > data is out in OS cache. See Uwe's excellent blog here:
> > http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> >
> > Best,
> > Erick
> >
> > On Thu, Dec 22, 2016 at 6:57 PM, Kumaran Ramasubramanian wrote:
> >> Thank you Adrien.
> >>
> >> "NumericDocValuesField is the one that supports sorting."
> >>
> >> Does this mean LongField/IntField just supports lexicographic order in
> >> sorting?
> >>
> >> -
> >> Kumaran R
> >>
> >> On Dec 22, 2016 11:28 PM, "Adrien Grand" wrote:
> >>
> >> On Thu, Dec 22, 2016 at 18:50, Kumaran Ramasubramanian <kums.134@gmail.com>
> >> wrote:
> >>
> >>> I want to provide sorting, range search and faceting in numeric fields.
> >>>
> >>> AFAIK, Purpose of different numeric field types are,
> >>>
> >>> NumericDocValuesField supports sorting and faceting
> >>> LongField/IntField supports range query and sorting
> >>>
> >>
> >> LongField/IntField only support querying, NumericDocValuesField is the one
> >> that supports sorting.
> >>
> >> Also note that as of 6.0 LongField and IntField have been replaced with
> >> LongPoint and IntPoint.
> >>
> >>> 1. Should i duplicate one field in above mentioned types to achieve all the
> >>> three features in numeric?
> >>>
> >>
> >> Yes. By the way it is perfectly fine to use the same field name for the
> >> point field and the doc values field.
> >>
> >>> 2. If i am ready to sacrifice faceting, is it advisable to use LongField
> >>> for sorting and range query?
> >>>
> >>
> >> Like said above you need doc values for sorting.
> >>
> >>> 3. During sorting, Will NumericDocValuesField( column stride storage)
> >>> perform better than LongField(trie structure)? If so , should i duplicate
> >>> field in both 1 and 2 cases?
> >>>
> >>
> >> Same note here.
> >>
> >> Adrien
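
The division of labour described in this thread (an indexed structure for range queries, a per-document column for sorting, and uninversion as the costly bridge between the two) can be illustrated with a toy model. This is only a sketch in plain Java collections. It is not Lucene code and does not mirror Lucene's internals; the class and method names (`UninvertSketch`, `invert`, `uninvert`, `rangeQuery`, `sortDocsByValue`) are hypothetical and chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: NOT Lucene code. All names here are hypothetical.
final class UninvertSketch {

    // Inverted ("indexed") view: value -> IDs of the documents holding that value.
    static TreeMap<Long, List<Integer>> invert(long[] valuesByDoc) {
        TreeMap<Long, List<Integer>> postings = new TreeMap<>();
        for (int docId = 0; docId < valuesByDoc.length; docId++) {
            postings.computeIfAbsent(valuesByDoc[docId], v -> new ArrayList<>()).add(docId);
        }
        return postings;
    }

    // Range query: the sorted inverted view answers "which docs fall in [lo, hi]?"
    // directly. Hits come back in value order, then docId order.
    static List<Integer> rangeQuery(TreeMap<Long, List<Integer>> postings, long lo, long hi) {
        List<Integer> hits = new ArrayList<>();
        for (List<Integer> docs : postings.subMap(lo, true, hi, true).values()) {
            hits.addAll(docs);
        }
        return hits;
    }

    // "Uninversion": visit every posting to rebuild a docId -> value column in memory.
    // This full pass is the cost that a precomputed column avoids.
    static long[] uninvert(Map<Long, List<Integer>> postings, int maxDoc) {
        long[] column = new long[maxDoc];
        for (Map.Entry<Long, List<Integer>> entry : postings.entrySet()) {
            for (int docId : entry.getValue()) {
                column[docId] = entry.getKey();
            }
        }
        return column;
    }

    // Sorting needs "what is the value of doc d?", which the column answers by index.
    static Integer[] sortDocsByValue(long[] column) {
        Integer[] docIds = new Integer[column.length];
        for (int i = 0; i < docIds.length; i++) {
            docIds[i] = i;
        }
        Arrays.sort(docIds, Comparator.comparingLong(d -> column[d]));
        return docIds;
    }

    public static void main(String[] args) {
        long[] column = {30L, 5L, 100L, 9L}; // value of docs 0..3, stored column-wise
        TreeMap<Long, List<Integer>> postings = invert(column);

        System.out.println(rangeQuery(postings, 9L, 100L));            // [3, 0, 2]
        System.out.println(Arrays.toString(sortDocsByValue(column)));  // [1, 3, 0, 2]
        // Without a stored column, sorting would first need the full uninversion pass:
        System.out.println(Arrays.equals(uninvert(postings, 4), column)); // true
    }
}
```

In this model, keeping both the postings and the column for the same field corresponds to Adrien's advice to index the value twice, once for querying and once as doc values for sorting.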