Mailing-List: contact java-user-help@lucene.apache.org; run by ezmlm
Reply-To: java-user@lucene.apache.org
From: "suman.holani"
Subject: RE: document object
Date: Fri, 11 Mar 2011 11:34:53 +0530
Message-ID: <002001cbdfb2$3d846290$b88d27b0$@holani@zapak.co.in>

Hello Erick,

hits.length is 1800.
The version is Lucene 3.0.3.

I need the entire result set. I fetch all records that satisfy the search
conditions, validate them against the current counts, and then schedule the
records that pass, selecting one of them by random scheduling.

I cannot take the results page by page, as that would starve the documents
at the end of the result set.

I cannot put the current counts into the index either, because they change
very frequently; it is not feasible to update the index every time they
change.

Please let me know of a solution.

Say there are 5 indexed fields: A, B, C, D, E.
When I search, 1000 records are fetched.
For the time being I want to use only A and D to validate the records
against the counts.
Note: fields B, C and E are not required at this point, but I am fetching
them as well and storing everything in a list.
A and D from the list are given to another process for validation.
After validation, 700 records remain in the list.
Of these, one record is displayed after scheduling, with all of its
fields: A, B, C, D, E.

Regards,
Suman

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com]
Sent: Thursday, March 10, 2011 7:46 PM
To: java-user@lucene.apache.org
Subject: Re: document object

If you're loading 100,000 documents, you can expect it to be slow. If
you're loading 10 documents, it should be quite fast...

So how big is hits.length? And what version of Lucene are you using? The
Hits object has been deprecated for quite some time, I believe.

The problem here is that you're loading the entire result set. This is
rarely the right thing to do, which is why paging is normally used.

Why do you need to load the entire result set? That seems to be the crux
of the issue.

Best
Erick

On Thu, Mar 10, 2011 at 5:22 AM, Anshum wrote:
> Depends on your data. I know that's a vague answer, but that's the point.
> What you could do is use FieldCache, if memory and data let you do so.
> Would it?
>
> --
> Anshum Gupta
> http://ai-cafe.blogspot.com
>
>
> On Thu, Mar 10, 2011 at 3:12 PM, suman.holani wrote:
>
>> Hi Anshum,
>>
>> Thanks for the prompt reply.
>>
>> I am only storing in the index the fields that I want to fetch after a
>> search.
>>
>> What I am not sure about is whether the searcher/reader call that
>> initializes the Document object is heavy.
>> Can we use something else in its place that does not need to load the
>> whole document again?
>>
>> Regards,
>> Suman
>>
>>
>> -----Original Message-----
>> From: Anshum [mailto:anshumg@gmail.com]
>> Sent: Thursday, March 10, 2011 3:11 PM
>> To: java-user@lucene.apache.org
>> Subject: Re: document object
>>
>> Hi Suman,
>> Do you need to load/use all fields that you have stored in the index?
>> If that's not the case, I'd suggest you use the
>>
>> public Document doc(int i, FieldSelector fieldSelector)
>>
>> function:
>>
>> <http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/document/Document.html>
>> <http://lucene.apache.org/java/3_0_1/api/core/org/apache/lucene/search/IndexSearcher.html#doc(int, org.apache.lucene.document.FieldSelector)>
>>
>> This should help you. Alternatively, if you only need a very selective
>> field that can be served through a FieldCache, that would be a nice
>> thing to do.
>>
>> Hope that helps.
>> --
>> Anshum Gupta
>> http://ai-cafe.blogspot.com
>>
>>
>> On Thu, Mar 10, 2011 at 3:01 PM, suman.holani wrote:
>>
>> > Hi,
>> >
>> > I am facing the following problem.
>> >
>> > The marked line in this loop runs very slowly and gives me a
>> > performance hit:
>> >
>> > for (int i = 0; i < hits.length; ++i) {
>> >     int docId = hits[i].doc;
>> >     Document d = searcher.doc(docId);  //problem
>> > }
>> >
>> > How can I improve this? Please give me an example of the improved code.
>> >
>> > Thanks,
>> > Suman
>> >
>> > PS:
>> >
>> > In one post Erick said:
>> >
>> > this line is really suspicious:
>> >
>> > Document document = this.indexReader.document(doc)
>> >
>> > From the Javadoc for HitCollector.collect:
>> >
>> > Note: This is called in an inner search loop. For good search
>> > performance, implementations of this method should not call
>> > Searcher.doc(int) or IndexReader.document(int) on every document
>> > number encountered.
>> > Doing so can slow searches by an order of magnitude or more.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
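
The workflow described in this thread (validate every hit on fields A and D, pick one valid hit at random, show that one with all fields) can be arranged as two passes, so that only a single document is ever loaded in full. The following is a minimal sketch of that arrangement, not Lucene code: StoredFields, Validator, InMemoryIndex and pickOne are hypothetical names invented for illustration, and the in-memory index exists only to make the sketch runnable. In Lucene 3.0.x, the role of load(...) would presumably be played by searcher.doc(docId, fieldSelector) with a FieldSelector naming only A and D, and loadAll(...) by searcher.doc(docId).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class TwoPassFetch {

    /** Hypothetical stand-in for stored-field access on an index searcher. */
    interface StoredFields {
        /** Loads only the named fields, like searcher.doc(docId, fieldSelector). */
        Map<String, String> load(int docId, String... fields);

        /** Loads every stored field, like searcher.doc(docId). */
        Map<String, String> loadAll(int docId);
    }

    /** Hypothetical hook for the external "current counts" validation. */
    interface Validator {
        boolean isValid(Map<String, String> lightFields);
    }

    /** In-memory fake index, used only to make the sketch runnable. */
    static final class InMemoryIndex implements StoredFields {
        private final List<Map<String, String>> docs = new ArrayList<>();
        int fullLoads = 0; // counts calls to loadAll

        InMemoryIndex(int size) {
            for (int i = 0; i < size; i++) {
                Map<String, String> doc = new LinkedHashMap<>();
                for (String f : new String[] {"A", "B", "C", "D", "E"}) {
                    doc.put(f, f + i); // e.g. field A of doc 7 holds "A7"
                }
                docs.add(doc);
            }
        }

        @Override
        public Map<String, String> load(int docId, String... fields) {
            Map<String, String> out = new LinkedHashMap<>();
            for (String f : fields) {
                out.put(f, docs.get(docId).get(f));
            }
            return out;
        }

        @Override
        public Map<String, String> loadAll(int docId) {
            fullLoads++;
            return new LinkedHashMap<>(docs.get(docId));
        }
    }

    /** Returns the full record of one randomly chosen valid hit, or null if none is valid. */
    static Map<String, String> pickOne(int[] hitDocIds, StoredFields index,
                                       Validator validator, Random random) {
        // Pass 1: read only A and D for every hit and keep the ids that validate.
        List<Integer> validIds = new ArrayList<>();
        for (int docId : hitDocIds) {
            if (validator.isValid(index.load(docId, "A", "D"))) {
                validIds.add(docId);
            }
        }
        if (validIds.isEmpty()) {
            return null;
        }
        // Pass 2: choose uniformly among all valid hits, then load that one in full.
        int chosen = validIds.get(random.nextInt(validIds.size()));
        return index.loadAll(chosen);
    }

    public static void main(String[] args) {
        InMemoryIndex index = new InMemoryIndex(1000);
        int[] hits = new int[1000];
        for (int i = 0; i < hits.length; i++) {
            hits[i] = i;
        }
        // Toy rule: a hit is valid when the number embedded in field A is divisible by 10.
        Validator validator = light -> Integer.parseInt(light.get("A").substring(1)) % 10 == 0;
        Map<String, String> winner = pickOne(hits, index, validator, new Random());
        System.out.println("fields=" + winner.keySet() + " fullLoads=" + index.fullLoads);
    }
}
```

Because the random choice is made over the complete list of valid ids, documents at the end of the result set are as likely to be picked as those at the start, which addresses the starvation concern without paging. Whether the lighter first pass is fast enough for 1800 hits depends on the index and would need to be measured.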