lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andreas Guther" <Andreas.Gut...@markettools.com>
Subject Speeding up looping over Hits
Date Thu, 22 Mar 2007 13:58:54 GMT
Hi,

While looking into performance enhancement for our search feature I
noticed a significant difference in Documents access time while looping
over Hits.

I wrote a test application search for a list of search terms and then
for each returned Hits object loops twice over every single hits.doc(i).

for (int i = 0; i < numberOfDocs; i++) {doc = hits.doc(i);}

I am seeing differences like the following

Found 16,215 hits for 'Water or Wine' in 219 ms
Processed 16,215 docs in 53,141 ms; per single doc 3.2773 ms
Processed 16,215 docs in 2,032 ms; per single doc 0.1253 ms

Interestingly if I run the same test application a second time in my IDE
the difference between the first and the second loop is very low.

I have no explanation why I see this difference but it becomes a huge
problem for us due to the fact that I need to extract from each document
a small set of information pieces and the first time looping just takes
too much time.

I could not find any indication for an external caching of Hits.  I am
running my tests within Eclipse with a memory setting of -Xms766M
-Xmx1024M.

What is the explanation in the different access speed for the same
search results?

Is there a way to speed up looping over the Hits data structure?

Andreas



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message