lucene-dev mailing list archives

From Toke Eskildsen ...@statsbiblioteket.dk>
Subject Re: A lucene performance test
Date Tue, 14 Aug 2012 09:36:38 GMT
On Mon, 2012-08-13 at 17:36 +0200, Stefan Trcek wrote:
> * There must be 2 "caches" in the system: The performance degrades
>   significantly beyond 1000 and again beyond 200000 documents
>   in the index.

With 1000 records, everything is in level 2 cache and a lot of it is in
level 1, which makes it blazingly fast. The index at 200K is 13MB, which
means that half of it fits in your level 2 cache. The next step, 500K, is
32MB, so your cache only holds 1/5 of it.
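
The arithmetic behind those fractions can be sketched as follows. The 6.5 MB cache size is an assumption inferred from "half of 13MB"; it is not a measured value, and the class and method names are made up for illustration.

```java
public class CacheFit {
    // Fraction of an index of the given size that fits in a cache of the
    // given size, capped at 1.0 when the whole index fits.
    static double fractionInCache(double indexMb, double cacheMb) {
        return Math.min(1.0, cacheMb / indexMb);
    }

    public static void main(String[] args) {
        // Assumption: level 2 cache size inferred from "half of 13MB".
        double assumedL2Mb = 6.5;
        // Index sizes quoted in the message for 200K and 500K documents.
        double[] indexSizesMb = {13.0, 32.0};
        for (double size : indexSizesMb) {
            double pct = 100 * fractionInCache(size, assumedL2Mb);
            System.out.printf("%.0f MB index: %.0f%% fits in cache%n", size, pct);
        }
    }
}
```

With the assumed cache size, the 13MB index comes out at one half and the 32MB index at roughly one fifth, matching the figures above.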

In a real setup, with all the other processes competing for level 2
cache, performance will likely be markedly lower.

> * JIT is very significant
> 
> * Index start up is very significant although the index is in the
>   IO cache and the query is the most simple.

As some internal Lucene structures are initialized upon first search, it
is generally advisable to discard the results from the first few
searches. 
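
A minimal sketch of discarding the first few runs before measuring is shown here. The Runnable stands in for a real search call, and the warm-up and measurement counts are arbitrary assumptions; nothing here is taken from the attached test.

```java
import java.util.Arrays;

public class WarmupTimer {
    // Runs the task warmupRuns times without recording anything, then
    // returns the elapsed nanoseconds of each of measuredRuns further runs.
    static long[] timeAfterWarmup(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // result discarded: JIT and lazy initialization happen here
        }
        long[] nanos = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the actual search under test.
        Runnable search = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) sum += i;
            if (sum == 42) System.out.println("unreachable");
        };
        long[] timings = timeAfterWarmup(search, 5, 20);
        Arrays.sort(timings);
        System.out.println("median ns: " + timings[timings.length / 2]);
    }
}
```

Reporting the median instead of the mean keeps a single slow run from dominating the figure.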

> * The performance of index size 20000000 is strange.
>   Worse performance than 50000000

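When I ran your test (results attached), the index at 20M had 76 files
and the index at 50M had 46 files, and I got the same slowdown at 20M as
you did. More files means more segments, and since a search has to visit
every segment and merge the per-segment results, more segments = more
merge overhead.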
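
To compare two index directories the same way, the file count can be read with a few lines of standard Java. This is only a sketch: the file count is a rough proxy for the segment count, and the class name and the default path are placeholders.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class IndexFileCount {
    // Counts the regular files directly inside dir (subdirectories are ignored).
    static long countRegularFiles(Path dir) {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.filter(Files::isRegularFile).count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder: pass the index directory as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(dir + ": " + countRegularFiles(dir) + " files");
    }
}
```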

Thank you for sharing your test & measurements,
Toke Eskildsen
