lucene-java-user mailing list archives

From Santa Clause <noclu...@yahoo.com>
Subject Re: Speeding up looping over Hits
Date Thu, 22 Mar 2007 20:23:09 GMT
Another thing you may want to look at is the newer version, 2.1.0, and getFieldable. I think
that will lazy-load the data, so that you only read the parts of the document that
you need at that moment rather than the whole thing. Someone please correct me if I am wrong
or point to what I really mean :)
  
I had a similar situation a long while back and was able to find a patch for the version
of Lucene I was using that allowed the above. It made a huge difference. I think something
similar is now built into 2.1.0.
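
To illustrate why lazy loading can help here, below is a minimal sketch in plain Java. It does not call Lucene at all: LazyFieldSketch, LazyField, loadEagerly, loadLazily and storedReads are hypothetical names made up for this example, and the "stored" map merely stands in for a document's stored fields. As far as I know, the real mechanism in 2.1 is a FieldSelector passed to IndexReader.document(int, FieldSelector), so please treat this only as a model of the idea.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical model of eager vs. lazy field loading; not Lucene code.
public class LazyFieldSketch {

    /** Hypothetical lazy field: the loader runs on first access only. */
    static final class LazyField {
        private final Supplier<String> loader;
        private String cached;
        private boolean loaded;

        LazyField(Supplier<String> loader) {
            this.loader = loader;
        }

        String stringValue() {
            if (!loaded) {
                cached = loader.get();
                loaded = true;
            }
            return cached;
        }
    }

    /** Counts simulated reads of stored field data. */
    static int storedReads = 0;

    /** Eager access: every stored field is read up front. */
    static Map<String, String> loadEagerly(Map<String, String> stored) {
        Map<String, String> doc = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : stored.entrySet()) {
            storedReads++;
            doc.put(e.getKey(), e.getValue());
        }
        return doc;
    }

    /** Lazy access: each field is wrapped and read only when asked for. */
    static Map<String, LazyField> loadLazily(Map<String, String> stored) {
        Map<String, LazyField> doc = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : stored.entrySet()) {
            String value = e.getValue();
            doc.put(e.getKey(), new LazyField(() -> {
                storedReads++;
                return value;
            }));
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, String> stored = new LinkedHashMap<>();
        stored.put("title", "Water or Wine");
        stored.put("body", "a very large stored text");
        stored.put("author", "unknown");

        storedReads = 0;
        String eagerTitle = loadEagerly(stored).get("title");
        int eagerReads = storedReads;

        storedReads = 0;
        String lazyTitle = loadLazily(stored).get("title").stringValue();
        int lazyReads = storedReads;

        // prints "eager reads: 3, lazy reads: 1"
        System.out.println("eager reads: " + eagerReads + ", lazy reads: " + lazyReads);
    }
}
```

If you only need a small set of fields per hit, the lazy variant touches just those fields, which is the effect I was hoping for from 2.1.0.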
  

Andreas Guther <Andreas.Guther@markettools.com> wrote:

Hi,

While looking into performance enhancements for our search feature I
noticed a significant difference in Document access time while looping
over Hits.

I wrote a test application that searches for a list of search terms and then,
for each returned Hits object, loops twice over every single hits.doc(i).

for (int i = 0; i < numberOfDocs; i++) {doc = hits.doc(i);}

I am seeing differences like the following:

Found 16,215 hits for 'Water or Wine' in 219 ms
Processed 16,215 docs in 53,141 ms; per single doc 3.2773 ms
Processed 16,215 docs in 2,032 ms; per single doc 0.1253 ms

Interestingly if I run the same test application a second time in my IDE
the difference between the first and the second loop is very low.

I have no explanation why I see this difference but it becomes a huge
problem for us due to the fact that I need to extract from each document
a small set of information pieces and the first time looping just takes
too much time.

I could not find any indication for an external caching of Hits.  I am
running my tests within Eclipse with a memory setting of -Xms766M
-Xmx1024M.

What is the explanation for the difference in access speed for the same
search results?

Is there a way to speed up looping over the Hits data structure?

Andreas





