lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <>
Subject Re: Speeding up looping over Hits
Date Thu, 22 Mar 2007 14:04:05 GMT
Your timing differences are probably because of caching. But this
has been mentioned many times in the archive, that a Hits object
is intended to allow fast, simple retrieval of the first few documents
in a result set (100 if memory serves). Each 100 or so calls to
next() causes the search to be re-issued.

See HitCollector, TopDocs, etc...


On 3/22/07, Andreas Guther <> wrote:
> Hi,
> While looking into performance enhancement for our search feature I
> noticed a significant difference in Documents access time while looping
> over Hits.
> I wrote a test application search for a list of search terms and then
> for each returned Hits object loops twice over every single hits.doc(i).
> for (int i = 0; i < numberOfDocs; i++) {doc = hits.doc(i);}
> I am seeing differences like the following
> Found 16,215 hits for 'Water or Wine' in 219 ms
> Processed 16,215 docs in 53,141 ms; per single doc 3.2773 ms
> Processed 16,215 docs in 2,032 ms; per single doc 0.1253 ms
> Interestingly if I run the same test application a second time in my IDE
> the difference between the first and the second loop is very low.
> I have no explanation why I see this difference but it becomes a huge
> problem for us due to the fact that I need to extract from each document
> a small set of information pieces and the first time looping just takes
> too much time.
> I could not find any indication for an external caching of Hits.  I am
> running my tests within Eclipse with a memory setting of -Xms766M
> -Xmx1024M.
> What is the explanation in the different access speed for the same
> search results?
> Is there a way to speed up looping over the Hits data structure?
> Andreas
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message