lucene-dev mailing list archives

From Andrzej Bialecki
Subject ANN: Lucene benchmark tool
Date Tue, 07 Dec 2004 01:18:03 GMT
Hi there,

After recent discussions on the speed of indexing/searching using 
different parameters it became even clearer that we need a comprehensive 
and repeatable benchmark.

I created a class which represents my first hack at benchmarking various 
aspects of Lucene, using a range of different parameters. Since it uses 
a standard, well-defined document collection, I hope that its results 
will be more or less comparable across different OS/hardware combinations.

I had a look at JUnitPerf, but found the API to be too limited for 
collecting complex time-series data, so I basically rolled my own 
benchmarking framework... If you know a better way to do it, I'm all ears.

I'm going to package it into a self-running application (WebStart?), but 
for now you can try to compile and run it yourself. You can get it here:

It depends on the commons-compress.jar, specifically on the Tar 
functionality. This JAR is in commons-sandbox, so it may not be readily 
available - in that case you can get it here:

(I will put an index page there, but for now use these direct links).

CAVEAT: please NOTE WELL that this benchmark runs at 100% CPU and 100% 
disk I/O for SEVERAL HOURS even on modern equipment (partial results 
are printed on System.out from time to time). You have been warned - so 
don't send me any fried mobos or melted drives for repairs, ok?

You can cut down the number of input parameters to reduce the overall 
time, or use the mini* document collection (but this reduces the number 
of documents in the index). See the comments in the source.
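
As a rough illustration of why fewer parameter values shorten the run, 
here is a minimal sketch of a parameter sweep that times every 
combination and prints partial results as it goes. It is not the actual 
benchmark class: the class name ParamSweepSketch, the parameter names 
mergeFactor and bufferedDocs, and the dummy workload are all hypothetical 
stand-ins, and a real run would index the document collection instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntBinaryOperator;

public class ParamSweepSketch {

    /** One timed run: the two parameter values and the elapsed wall-clock time. */
    static final class Sample {
        final int mergeFactor;     // hypothetical parameter name
        final int bufferedDocs;    // hypothetical parameter name
        final long elapsedNanos;

        Sample(int mergeFactor, int bufferedDocs, long elapsedNanos) {
            this.mergeFactor = mergeFactor;
            this.bufferedDocs = bufferedDocs;
            this.elapsedNanos = elapsedNanos;
        }
    }

    /** Runs the workload once for every combination of the two parameter ranges. */
    static List<Sample> sweep(int[] mergeFactors, int[] bufferedDocs,
                              IntBinaryOperator workload) {
        List<Sample> samples = new ArrayList<>();
        for (int mf : mergeFactors) {
            for (int bd : bufferedDocs) {
                long start = System.nanoTime();
                workload.applyAsInt(mf, bd);
                long elapsed = System.nanoTime() - start;
                samples.add(new Sample(mf, bd, elapsed));
                // Partial result, printed as soon as this combination finishes.
                System.out.println("mergeFactor=" + mf + " bufferedDocs=" + bd
                        + " elapsedMs=" + elapsed / 1_000_000);
            }
        }
        return samples;
    }

    public static void main(String[] args) {
        // Dummy workload standing in for a real indexing or search run (assumption).
        IntBinaryOperator dummy = (mf, bd) -> {
            int sum = 0;
            for (int i = 0; i < mf * bd; i++) {
                sum += i;
            }
            return sum;
        };
        List<Sample> samples =
                sweep(new int[] {10, 20, 50}, new int[] {100, 1000}, dummy);
        System.out.println(samples.size() + " runs");
    }
}
```

The number of runs is the product of the range sizes (3 x 2 = 6 above), 
so dropping one value from either range removes a whole row of runs.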

Comments and patches are welcome!

Best regards,
Andrzej Bialecki

Software Architect, System Integration Specialist
CEN/ISSS EC Workshop, ECIMF project chair
EU FP6 E-Commerce Expert/Evaluator
FreeBSD developer (
