lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <>
Subject Re: Speeding up looping over Hits
Date Thu, 22 Mar 2007 20:34:02 GMT
Oh yeah.. By only loading the relevant fields, my query times
reduced by over 90%. I actually wrote that up on the mailing list if
you wanted to try to find it, but it took Andreas' message to
remind me...


On 3/22/07, Santa Clause <> wrote:
> Another thing you may want to look at is the newer version 2.1.0and  getFieldable. I
think that will lazy load the data, that way you
> are  only reading the parts of the document that you need at that
> moment  rather than the whole thing. Someone please correct me if I am wrong
> or  point to what I really mean :)
>   I had a similar situation a long while back and I was able to find
> a  patch for the version of Lucene I was using that allowed the above.
> It  made a huge difference. I think something similar is now built in
> 2.1.0.
> Andreas Guther <> wrote:  Hi,
> While looking into performance enhancement for our search feature I
> noticed a significant difference in Documents access time while looping
> over Hits.
> I wrote a test application search for a list of search terms and then
> for each returned Hits object loops twice over every single hits.doc(i).
> for (int i = 0; i < numberOfDocs; i++) {doc = hits.doc(i);}
> I am seeing differences like the following
> Found 16,215 hits for 'Water or Wine' in 219 ms
> Processed 16,215 docs in 53,141 ms; per single doc 3.2773 ms
> Processed 16,215 docs in 2,032 ms; per single doc 0.1253 ms
> Interestingly if I run the same test application a second time in my IDE
> the difference between the first and the second loop is very low.
> I have no explanation why I see this difference but it becomes a huge
> problem for us due to the fact that I need to extract from each document
> a small set of information pieces and the first time looping just takes
> too much time.
> I could not find any indication for an external caching of Hits.  I am
> running my tests within Eclipse with a memory setting of -Xms766M
> -Xmx1024M.
> What is the explanation in the different access speed for the same
> search results?
> Is there a way to speed up looping over the Hits data structure?
> Andreas
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
> ---------------------------------
> Now that's room service! Choose from over 150,000 hotels
> in 45,000 destinations on Yahoo! Travel to find your fit.

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message