lucene-dev mailing list archives

From Chuck Williams
Subject Re: Benchmarking on GOV2
Date Tue, 30 May 2006 00:35:40 GMT

Sebastiano Vigna wrote on 05/28/2006 10:39 PM:
> but we will certainly need
> some help to configure Lucene so that it works at its best.
> We would like to measure indexing time and query answer time

I'm not sure what form you would like that help to take, but here are a
couple of high-level points, IMHO:

   1. Be sure a single JVM process runs all the benchmarks.
      Many bogus Lucene benchmarks have been produced by using a
      separate command-line java invocation for each operation.
   2. Don't use Hits-based search operators if you want anything other
      than exactly 50 results (50 is, surprisingly, a magic number
      hardwired into Hits).  It appears the paper referenced elsewhere
      on this thread looked at recall and precision over a much larger
      result set.  Use a HitCollector with a TopDocs or TopFieldDocs to
      collect the number of results you want without redoing the search
      a bunch of times unnecessarily.
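
To make point 1 concrete, here is a minimal sketch of a harness that
times several operations inside one JVM process, with untimed warm-up
runs before the measured ones. Everything in it is an assumption for
illustration: the class name SingleJvmBenchmark, the medianNanos helper,
and the standInWork placeholder are hypothetical and not part of Lucene.
In a real benchmark the two Runnables would wrap the actual indexing
and query calls.

```java
import java.util.Arrays;

// Hypothetical harness: all operations are timed in the same JVM process,
// so JVM startup and JIT warm-up are not charged to any single measurement.
final class SingleJvmBenchmark {

    // Written by standInWork so the JIT cannot discard the loop as dead code.
    static volatile long sink;

    /**
     * Runs op warmupRuns times without timing it, then times measuredRuns
     * further runs and returns the (upper) median elapsed time in nanoseconds.
     */
    static long medianNanos(Runnable op, int warmupRuns, int measuredRuns) {
        if (measuredRuns < 1) {
            throw new IllegalArgumentException("measuredRuns must be >= 1");
        }
        for (int i = 0; i < warmupRuns; i++) {
            op.run();
        }
        long[] samples = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            op.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[measuredRuns / 2];
    }

    // Placeholder workload (an assumption); replace with real indexing or search code.
    static void standInWork(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i * 31L;
        }
        sink = sum;
    }

    public static void main(String[] args) {
        Runnable indexing = () -> standInWork(200_000); // stand-in for indexing
        Runnable querying = () -> standInWork(20_000);  // stand-in for a query
        System.out.println("indexing median ns: " + medianNanos(indexing, 5, 20));
        System.out.println("querying median ns: " + medianNanos(querying, 5, 20));
    }
}
```

Reporting the median rather than the mean is just one reasonable choice
here; it keeps a single slow run (for example, one hit by a GC pause)
from dominating the reported number.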

