lucene-java-user mailing list archives

From Toke Eskildsen
Subject RE: The performance of lucene searching(web entironment) test
Date Fri, 13 Jun 2008 06:59:27 GMT
On Wed, 2008-06-11 at 18:56 +0800, lutan wrote:
> Yes, I have tested again in the same environment, but using a singleton
> IndexSearcher. The performance has increased. With 100 concurrent
> users requesting different keywords, I get 60 TPS (2 TPS before).
> Now the bottleneck seems to be the CPU, with CPU usage approaching
> 100%. Both RAM (70 MB used on average) and HD usage are normal.

It sounds like you have found the solution to your immediate problem.
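
For readers of the archive, here is a minimal sketch of the "singleton" idea: open the expensive object once and hand the same instance to every request thread. It is not Lucene code. SharedInstance and its factory are hypothetical names, and the type parameter T stands in for the searcher so that the sketch runs with only the JDK.

```java
import java.util.function.Supplier;

/** Lazily creates one shared instance and returns the same object to every caller. */
final class SharedInstance<T> {
    // Hypothetical factory: in a real setup this is where the index would be opened.
    private final Supplier<T> factory;
    private volatile T instance;

    SharedInstance(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Thread-safe lazy initialization; the factory runs at most once. */
    T get() {
        T local = instance;
        if (local == null) {
            synchronized (this) {
                local = instance;
                if (local == null) {
                    local = factory.get();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

Each request would call get() on one application-wide SharedInstance instead of constructing its own object, so the opening cost is paid once rather than per request.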

> Can I assume that as long as I have more RAM, I
> will get good performance?

Depends on your index-size (in bytes). When your index grows, less and
less of it can fit in the disk-cache and more time will be required for
proper warm-up. But the change will happen gradually, so you'll only be
surprised if you suddenly increase your index-size to double or more.

> I don't understand the meaning of "disk-cache" very clearly. Could you please
> explain it again? Thanks a lot! (Doesn't it cache in RAM?)
> Does warm-up == cache?

There are (at least) two important memory mechanisms to consider.
My apologies if some of this is basic knowledge to you:

1) Disk-cache.
In general, the free RAM on your Linux-system is used for disk-cache.
With an index-size of 3 GB and (just a guess) 1 GB free RAM, the
operating system is able to cache 1/3 or less of your index. If you open
the same index several times in a row, the disk-cache will be warmed to
the relevant parts of your index, so that you're not even hitting the
disk after a while. At least not for opening. This is the effect you
observed with your non-singleton based test, where the speed increased
slowly up to a not-so-high level.
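
The 1/3 figure is plain arithmetic, which can be sketched as below. DiskCacheEstimate is a hypothetical helper, and the result is only an upper bound, assuming all free RAM is available to the disk-cache and nothing else competes for it.

```java
final class DiskCacheEstimate {
    private DiskCacheEstimate() {}

    /**
     * Upper bound on the fraction of the index that the OS could hold in
     * disk-cache, assuming all free RAM is used for caching this index.
     */
    static double cacheableFraction(long indexBytes, long freeRamBytes) {
        if (indexBytes <= 0) {
            throw new IllegalArgumentException("indexBytes must be positive");
        }
        if (freeRamBytes < 0) {
            throw new IllegalArgumentException("freeRamBytes must not be negative");
        }
        // Cap at 1.0: an index smaller than free RAM can be cached completely.
        return Math.min(1.0, (double) freeRamBytes / indexBytes);
    }
}
```

With a 3 GB index and 1 GB free RAM this gives one third. Doubling the index to 6 GB with the same RAM drops the bound to one sixth, which is the kind of sudden change mentioned earlier.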

2) Lucene internal structures.
I don't know much about this, so I hope somebody will correct me if I
make mistakes: Lucene has some internal structures that are initialized
when searches are performed. Depending on setup, this initialization can
be quite heavy (custom search for example). Performing warm-up, such as
searching with previously logged queries, will initialize these
structures before the real queries are received. This is the effect you
observed with your singleton searcher.

1 & 2 can be seen in combination, as the initialization of the internal
structures in Lucene requires a fair amount of seeks in the index data.
If there's nothing in the disk-cache and a conventional platter-based
harddisk is used, it takes some time. If the disk-cache is warmed from
previous use or a solid state drive setup is used, it is much faster.

> How many docs does Lucene cache by default? And can I control the
> cache size?

I don't know. Maybe someone else will chime in?
