lucene-dev mailing list archives

From Toke Eskildsen
Subject Re: A lucene performance test
Date Tue, 14 Aug 2012 09:36:38 GMT
On Mon, 2012-08-13 at 17:36 +0200, Stefan Trcek wrote:
> * There must be 2 "caches" in the system: The performance degrades
>   significantly beyond 1000 and again beyond 200000 documents
>   in the index.

With 1000 records, everything is in level 2 cache and a lot in level 1,
which makes it blazingly fast. The index at 200K is 13MB, which means
that half of it fits in your level 2 cache. The next step 500K is 32MB,
so your cache only holds 1/5 of it.
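
The arithmetic behind those fractions can be sketched as below. This is
only an illustration: the 6.5MB cache size is an assumption inferred from
"half of 13MB", not a measured value, and the class and method names are
made up for the example.

```java
// Sketch: estimate how much of an index fits in the level 2 cache.
class CacheFit {
    // Fraction of the index that can be resident in the cache, capped at 1.0.
    static double fractionInCache(double indexMB, double cacheMB) {
        return Math.min(1.0, cacheMB / indexMB);
    }

    public static void main(String[] args) {
        double cacheMB = 6.5;                 // assumed, inferred from the ratios above
        double[] indexSizesMB = {13.0, 32.0}; // index sizes at 200K and 500K documents
        for (double size : indexSizesMB) {
            System.out.printf("%.0fMB index: %.0f%% in cache%n",
                    size, 100 * fractionInCache(size, cacheMB));
        }
    }
}
```

With these numbers the 13MB index comes out at one half and the 32MB
index at roughly one fifth.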

In a real setup, with all the other processes competing for level 2
cache, performance will likely be markedly lower.

> * JIT is very significant
> * Index start up is very significant although the index is in the
>   IO cache and the query is the most simple.

As some internal Lucene structures are initialized upon first search, it
is generally advisable to discard the results from the first few
searches.
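
A minimal sketch of such a warm-up phase follows. It uses plain Java and
no Lucene calls; the class name, the method name and the placeholder
workload are hypothetical, and the Runnable would have to be replaced by
the actual search.

```java
// Sketch: time a task while discarding the first few runs.
class WarmupTimer {
    // Runs the task warmupRuns + measuredRuns times and returns the mean
    // wall-clock time in nanoseconds of the measured runs only.
    static double meanNanos(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // not timed: JIT compilation and lazy initialization happen here
        }
        long total = 0;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            total += System.nanoTime() - start;
        }
        return (double) total / measuredRuns;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for a single search.
        Runnable search = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) sum += i;
            if (sum == 42) System.out.println(sum); // keeps the loop from being optimized away
        };
        System.out.println(meanNanos(search, 5, 20) + " ns per run");
    }
}
```

How many runs to discard is a judgment call; comparing the mean with and
without the warm-up runs shows how much they skew the result.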

> * The performance of index size 20000000 is strange.
>   Worse performance than 50000000

When I ran your test (results attached), the index at 20M had 76 files
and the index at 50M had 46 files and I got the same slowdown at 20M as
you did. More segments means more overhead per search, as each segment
is searched and the results are merged.
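
The file counts can be reproduced with something like the sketch below.
It only counts regular files in a directory, which is a rough proxy for
the number of segments since a segment normally consists of several
files. The class name is made up and the index path is taken from the
command line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Sketch: count the regular files in an index directory.
class IndexFileCount {
    static long countFiles(Path indexDir) throws IOException {
        // Files.list is not recursive; the stream is closed to release the directory handle.
        try (Stream<Path> entries = Files.list(indexDir)) {
            return entries.filter(Files::isRegularFile).count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the current directory when no path is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(dir + ": " + countFiles(dir) + " files");
    }
}
```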

Thank you for sharing your test & measurements,
Toke Eskildsen
