lucene-java-user mailing list archives

From Otis Gospodnetic
Subject Search benchmark: 2.0 vs. 2.2-dev and heap sizing
Date Tue, 06 Mar 2007 16:16:16 GMT

I'm doing some Lucene search benchmarking (got to love massive query logs :)) and have 2 questions:

1) Has anyone compared Lucene 2.0 and 2.2-dev?  My benchmarks found 2.2-dev (freshly baked)
to be somewhat slower than 2.0, despite all those performance improvements (see CHANGES.txt)...
Has anyone else done the comparison?  My queries are a mixture of 2-3 required keywords (majority)
and phrase queries with 2-3 keywords.

To give you an idea about how much slower 2.2-dev is for me, here are some counts for queries
I considered slow (> 1s latency) during my benchmark with 8 concurrent search threads and
then 64 threads:

$ grep -c SLOW 5-shard-log-2.0/8.log 
$ grep -c SLOW 5-shard-log-2.2-dev/8.log 

$ grep -c SLOW 5-shard-log-2.0/64.log 
$ grep -c SLOW 5-shard-log-2.2-dev/64.log 

This is out of a total of 100K queries.
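
For anyone who wants the slow-query share rather than a bare count, here is a minimal sketch of the same tally in Python. It assumes only what the grep commands imply, namely that a slow query is logged on a line containing the string SLOW. The sample lines are made up, since the real log format is not shown here.

```python
def slow_fraction(log_lines, total_queries):
    """Count lines containing 'SLOW' (what `grep -c SLOW` reports)
    and return (count, share of all queries)."""
    slow = sum(1 for line in log_lines if "SLOW" in line)
    return slow, slow / total_queries

# Hypothetical log excerpt; the real log format is an assumption.
sample = [
    "q=foo 120ms",
    "q=bar 1530ms SLOW",
    'q="a b" 2100ms SLOW',
]
count, share = slow_fraction(sample, total_queries=len(sample))
print(count, f"{share:.1%}")
```

For a real run, pass an open log file as log_lines and 100000 as total_queries.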

2) My benchmark was against 5 optimized compound Lucene indices, about 9GB each, on a box
with 32GB of RAM and several CPUs.  I gave the JVM 22GB with both -Xms and -Xmx.  However, I am
wondering whether giving it that much is actually smart.  While I'm letting the JVM use more RAM,
I'm taking it away from the OS for FS caching.  So, I'm now thinking about running the same
benchmark, but with a smaller max heap.  But how much should I give it?  I'm thinking about adding
up the sizes of all .tii files, adding some padding for the JVM, GC, etc., and using that.  Is
there anything else I should consider here?
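
As a rough illustration of the trade-off described above, the sketch below computes an upper bound on how much of the index the OS could keep in its file system cache for a given heap size. The 32GB, 22GB and 5 x 9GB figures come from this message. The alternative heap sizes (8 and 2) are arbitrary examples, and the calculation ignores JVM non-heap memory and anything else running on the box.

```python
def cacheable_fraction(total_ram_gb, heap_gb, index_gb):
    """Upper bound on the share of the index the OS page cache could hold,
    assuming all RAM not given to the heap is available for caching."""
    free_for_cache = max(total_ram_gb - heap_gb, 0)
    return min(free_for_cache / index_gb, 1.0)

TOTAL_RAM_GB = 32
INDEX_GB = 5 * 9  # five indices of about 9GB each

# 22 is the heap size used in the benchmark; 8 and 2 are hypothetical.
for heap_gb in (22, 8, 2):
    share = cacheable_fraction(TOTAL_RAM_GB, heap_gb, INDEX_GB)
    print(f"heap {heap_gb:>2} GB -> up to {share:.0%} of the index in FS cache")
```

With a 22GB heap, at most 10GB of a roughly 45GB index can sit in the cache, so shrinking the heap is the only way to raise that ceiling on this box.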

So I looked at one of the .cfs files:

_0.f0: 11164467 bytes
... other fields, same size, of course
_0.fdt: 381343723 bytes
_0.fdx: 89315736 bytes
_0.fnm: 78 bytes
_0.frq: 4591955197 bytes
_0.prx: 4242807266 bytes
_0.tii: 11498861 bytes
_0.tis: 829868070 bytes

Here, the .tii file is only about 11 MB.  That looks awfully small!  There is no way 5 x 11
MB + padding will be enough.  Should I be adding the size of some other file(s)?  .tis perhaps?
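
The arithmetic behind that doubt can be sketched as follows. The byte count is taken from the listing above. The expansion factor (for the term index taking more room as Java objects than it does on disk) and the fixed padding are made-up placeholders, not measured values.

```python
TII_BYTES = 11_498_861  # size of _0.tii from the listing above
SHARDS = 5              # number of indices in the benchmark

def heap_estimate_mib(tii_bytes, shards, expansion=1.0, padding_mib=0.0):
    """Estimate heap as (sum of .tii sizes) * expansion + padding, in MiB.

    expansion and padding_mib are hypothetical knobs; the on-disk size of
    the term index is likely only a lower bound on its in-memory size.
    """
    return tii_bytes * shards * expansion / 2**20 + padding_mib

raw = heap_estimate_mib(TII_BYTES, SHARDS)
padded = heap_estimate_mib(TII_BYTES, SHARDS, expansion=4.0, padding_mib=256)
print(f"raw .tii total: {raw:.1f} MiB")
print(f"with 4x expansion and 256 MiB padding: {padded:.1f} MiB")
```

The raw total comes to roughly 55 MiB for all five indices, which is tiny next to the 22GB heap used in the benchmark.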

