lucene-java-user mailing list archives

From David Spencer <dave-lucene-u...@tropo.com>
Subject Re: Search performance under high load
Date Wed, 06 Apr 2005 23:11:28 GMT
Daniel Herlitz wrote:

> Hi everybody,
> 
> We have been using Lucene for about one year now with great success. 
> Recently though the index has grown noticeably and so has the number of 
> searches. I was wondering if anyone would like to comment on these 
> figures and say if it works for them?
> 
> Index size: ~2.5 GB, on disk
> Number of fields: ~30
> Number of indexed fields: ~10
> Server: Linux, Intel(R) Xeon(TM) CPU 3.00GHz, 3GB, dedicated to Lucene 
> searches.
> Java: Sun 1.5, -Xmx1200m

For perf tuning on 1.4+ VMs I always try these flags too:

-server
-XX:CompileThreshold=100
-Xverify:none

It is also worth setting -Xms equal to -Xmx, so the heap is allocated at 
full size up front instead of being resized while searches are running.
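
A quick way to confirm which of these flags actually reached the running 
JVM is to read them back at startup. This is only a minimal sketch: the 
class name JvmFlagCheck and its helpers are made up for illustration, and 
the heap comparison is purely textual (so "1g" and "1024m" count as 
different).

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper, not part of Lucene: inspects the JVM's startup flags.
class JvmFlagCheck {

    // Returns the text after the given prefix for the last matching flag, or null.
    static String lastValue(List<String> jvmArgs, String prefix) {
        String value = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                value = arg.substring(prefix.length());
            }
        }
        return value;
    }

    // True when both -Xms and -Xmx are present and spelled the same way.
    // Textual comparison only: "1g" and "1024m" are treated as different.
    static boolean heapIsFixed(List<String> jvmArgs) {
        String xms = lastValue(jvmArgs, "-Xms");
        String xmx = lastValue(jvmArgs, "-Xmx");
        return xms != null && xms.equalsIgnoreCase(xmx);
    }

    public static void main(String[] args) {
        // Flags passed to this JVM, e.g. -Xmx1200m; program arguments are not included.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM flags: " + jvmArgs);
        System.out.println("-Xms equals -Xmx: " + heapIsFixed(jvmArgs));
    }
}
```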



> Load: Approaching 2000 requests / hour.
> Queries: The query strings are of highly differing complexity, from 
> simple x:y to long queries involving conjunctions, disjunctions and 
> wildcard queries.
> 
> 90% of the queries run brilliantly. Problem is that 10% of the queries 
> (simple or not) take a long time, on average more than 10 seconds, 
> sometimes several minutes.
> 
> We have managed to track down these figures to the calls to 
> IndexSearcher.search(Query). We have seen up to about 10 searches 
> concurrently executing.
> 
> We have tried to run the server on different machines and with different 
> versions of Java. We have no OutOfMemoryErrors.
> 
> I am curious about what to expect from Lucene when it comes to 
> searching. There are lots of figures about the indexing speed (no 
> question about that, it's incredibly fast!). But what about searching? 
> And searching with the kind of load we have. Anyone in the same 
> situation as we are? Comments? Suggestions?

Well, in a benchmark I ran recently, fuzzy queries were the problem in 
the mix I had. To be fair, though, a fuzzy search is really just one big 
query, since it expands the query term into all "similar" terms.
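
To make the cost of that expansion concrete, here is a rough plain-Java 
sketch. It is not Lucene's FuzzyQuery code: the class name, the in-memory 
term list, and the use of a simple edit-distance cutoff are all assumptions 
made for illustration. The point is only that every term in the dictionary 
has to be compared, and every match becomes one more clause to search.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Illustration only: a stand-in for how a fuzzy term fans out into many terms.
class FuzzyExpansionSketch {

    // Classic Levenshtein distance using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Scans every known term and keeps those within maxEdits of the query term.
    // The result is what the "big query" would effectively OR together.
    static List<String> expand(String term, Collection<String> indexTerms, int maxEdits) {
        List<String> similar = new ArrayList<>();
        for (String candidate : indexTerms) {
            if (editDistance(term, candidate) <= maxEdits) {
                similar.add(candidate);
            }
        }
        return similar;
    }
}
```

With a term dictionary of hundreds of thousands of entries, both the scan 
and the resulting number of clauses grow accordingly, which would be 
consistent with fuzzy queries standing out in a benchmark.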

Also of interest: what exactly is the problem with the long-running 
queries? Are they slowing down the response time for the other users 
with shorter queries?

I've never done this, but you could consider a thread pool to execute 
the queries, and once a query takes more than, say, a second, you lower 
its priority.
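
An untested idea like this could be sketched roughly as follows, using only 
java.util.concurrent (available from Java 5; the lambdas need Java 8 or 
later). The class name DemotingQueryExecutor and its parameters are 
hypothetical, and the query is passed in as a plain Callable, so nothing 
here depends on Lucene. Keep in mind that thread priorities are only a hint 
to the scheduler and may have little effect on some platforms.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: runs queries on a pool and demotes the ones that run long.
class DemotingQueryExecutor {
    private final ExecutorService workers;
    private final ScheduledExecutorService watchdog = Executors.newSingleThreadScheduledExecutor();
    private final long slowAfterMillis;

    DemotingQueryExecutor(int threads, long slowAfterMillis) {
        this.workers = Executors.newFixedThreadPool(threads);
        this.slowAfterMillis = slowAfterMillis;
    }

    <T> Future<T> submit(Callable<T> query) {
        Callable<T> wrapped = () -> {
            Thread worker = Thread.currentThread();
            Object lock = new Object();
            boolean[] finished = {false};
            // If the query is still running after the threshold, lower its thread's priority.
            ScheduledFuture<?> demotion = watchdog.schedule(() -> {
                synchronized (lock) {
                    if (!finished[0]) {
                        worker.setPriority(Thread.MIN_PRIORITY);
                    }
                }
            }, slowAfterMillis, TimeUnit.MILLISECONDS);
            try {
                return query.call();
            } finally {
                // Pool threads are reused, so restore the priority before the next query.
                synchronized (lock) {
                    finished[0] = true;
                    worker.setPriority(Thread.NORM_PRIORITY);
                }
                demotion.cancel(false);
            }
        };
        return workers.submit(wrapped);
    }

    void shutdown() {
        workers.shutdown();
        watchdog.shutdown();
    }
}
```

A caller would wrap the actual search call in the Callable, for example 
`executor.submit(() -> searcher.search(query))`, and wait on the returned 
Future.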

Also, I'd have a rule like no more than "n" slow queries can run at 
once, so you queue up slow queries if there are lots of them executing.
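
One way to express such a rule is a counting semaphore with "n" permits. 
The sketch below is an assumption-laden illustration, not a tested design: 
SlowQueryGate is a made-up name, and the looksSlow predicate (for instance, 
"the query string contains a wildcard or fuzzy operator") is a hypothetical 
up-front guess at which queries will be slow.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;
import java.util.function.Predicate;

// Hypothetical sketch: lets at most maxConcurrentSlow "slow" queries run at once.
class SlowQueryGate {
    private final Semaphore slowPermits;
    private final Predicate<String> looksSlow;

    SlowQueryGate(int maxConcurrentSlow, Predicate<String> looksSlow) {
        // Fair semaphore: waiting slow queries are admitted in arrival order.
        this.slowPermits = new Semaphore(maxConcurrentSlow, true);
        this.looksSlow = looksSlow;
    }

    <T> T run(String queryString, Callable<T> search) throws Exception {
        if (!looksSlow.test(queryString)) {
            // Queries expected to be fast bypass the gate entirely.
            return search.call();
        }
        // Blocks (queues) here while the maximum number of slow queries is running.
        slowPermits.acquire();
        try {
            return search.call();
        } finally {
            slowPermits.release();
        }
    }

    int availableSlowPermits() {
        return slowPermits.availablePermits();
    }
}
```

For example, `new SlowQueryGate(2, q -> q.contains("*") || q.contains("~"))` 
would let any number of plain term queries through while holding wildcard 
and fuzzy queries to two at a time.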



> 
> Thanks
> Daniel
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 

