From: Sandeep Khanzode
To: Lucene Users
Subject: Sort, Search & Facets
Date: Mon, 7 Jul 2014 23:50:05 -0700

Hi,

I am using Lucene 4.7.2, and my primary use case for Lucene is to do three things: (a) search, (b) sort the search results by a number of fields, and (c) facet on probably an equal number of fields (probably the most standard use cases anyway).

Let us say I have a corpus of more than 100m docs, with each document having approx. 10-15 fields, excluding the content (body), which will also be one of the fields. I have a requirement to enable both sorting and faceting on all 10-15 of them. That makes a total of approx. 45 fields to be indexed, because each field is indexed three times: once as a String/Long/TextField, once as a SortedDocValuesField, and once as a FacetField.

What will be the impact of this on the indexing operation, both in the time taken and in the extra disk space required? Will it grow linearly with the number of fields?

What is the impact on memory usage at search time?

I will attempt to benchmark some of these, but if you have any experience with this, please share the details. Thanks,

-----------------------
Thanks and Regards,
Sandeep Ramesh Khanzode
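As a rough sanity check on the field counts above, here is a minimal sketch of the arithmetic in plain Java. It assumes that "100m" means 100 million documents and that every one of the 10-15 fields is indexed in all three representations (search, sort, facet); the body field is not counted. The class and method names are made up for illustration, and the sketch says nothing about actual disk usage or indexing time, which still need benchmarking.

```java
public class FieldCountEstimate {

    // Indexed fields per document when every logical field is written
    // once per representation (search field, doc-values field, facet field).
    static int indexedFieldsPerDoc(int logicalFields, int representations) {
        return logicalFields * representations;
    }

    // Field instances across the whole corpus; long avoids int overflow.
    static long totalFieldInstances(long docs, int fieldsPerDoc) {
        return docs * fieldsPerDoc;
    }

    public static void main(String[] args) {
        int representations = 3;      // search + sort + facet (from the question)
        long docs = 100_000_000L;     // assumption: "100m" = 100 million docs

        // Lower and upper bound of the 10-15 field range.
        for (int logical : new int[] {10, 15}) {
            int perDoc = indexedFieldsPerDoc(logical, representations);
            long total = totalFieldInstances(docs, perDoc);
            System.out.println(logical + " logical fields: " + perDoc
                    + " indexed fields per doc, " + total + " field instances");
        }
    }
}
```

The upper bound (15 fields) reproduces the figure of 45 indexed fields per document; the lower bound (10 fields) gives 30.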