Daniel Herlitz wrote:
> Hi everybody,
>
> We have been using Lucene for about one year now with great success.
> Recently, though, the index has grown noticeably and so has the number of
> searches. I was wondering if anyone would like to comment on these
> figures and say if it works for them?
>
> Index size: ~2.5 GB, on disk
> Number of fields: ~30
> Number of indexed fields: ~10
> Server: Linux, Intel(R) Xeon(TM) CPU 3.00GHz, 3GB, dedicated to Lucene
> searches.
> Java: Sun 1.5, -Xmx1200m
For perf tuning on 1.4+ VMs I always try these flags too:
-server
-XX:CompileThreshold=100
-Xverify:none
Also worth considering: set -Xms equal to -Xmx, so the heap is sized once
at startup rather than grown under load.
> Load: Approaching 2000 requests / hour.
> Queries: The query strings are of highly differing complexity, from
> simple x:y to long queries involving conjunctions, disjunctions and
> wildcard queries.
>
> 90% of the queries run brilliantly. Problem is that 10% of the queries
> (simple or not) take a long time, on average more than 10 seconds,
> sometimes several minutes.
>
> We have managed to track down these figures to the calls to
> IndexSearcher.search(Query). We have seen up to about 10 searches
> concurrently executing.
>
> We have tried to run the server on different machines and with different
> versions of Java. We have seen no OutOfMemoryErrors.
>
> I am curious about what to expect from Lucene when it comes to
> searching. There are lots of figures about the indexing speed (no
> question about that, it's incredibly fast!). But what about searching?
> And searching with the kind of load we have. Anyone in the same
> situation as we are? Comments? Suggestions?
Well, in a benchmark I ran recently, fuzzy queries were the slow ones in
the mix I had. To be fair, a fuzzy search is really just one big query,
since it expands the query term into all "similar" terms in the index.
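To make the expansion point concrete, here is a rough sketch of the idea. It is an illustration only, not Lucene's own code: the class name FuzzyExpansionSketch, the in-memory dictionary list, and the maxEdits cutoff are all made up for the example. It compares one query term against every term in a dictionary and keeps those within a given edit distance, which is why the cost grows with the number of distinct terms in the index.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not part of Lucene. */
final class FuzzyExpansionSketch {

    /** Classic dynamic-programming Levenshtein (edit) distance. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev; // reuse the two rows instead of allocating
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Returns every dictionary term within maxEdits of the query term.
     * Each returned term would become one clause of the "big query".
     */
    static List<String> expand(String term, List<String> dictionary, int maxEdits) {
        List<String> matches = new ArrayList<>();
        for (String candidate : dictionary) {
            if (editDistance(term, candidate) <= maxEdits) {
                matches.add(candidate);
            }
        }
        return matches;
    }
}
```

Every candidate term costs one distance computation, and every match adds a clause to the final query, so a fuzzy term against a large term dictionary can easily be the most expensive thing in a query mix.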
Also of interest: what exactly is the problem with the long-running
queries? Are they slowing down response times for the other users with
shorter queries?
I've never done this, but you could consider a thread pool to execute
the queries, and once a query takes more than, say, a second, you lower
its priority.
Also, I'd have a rule like no more than "n" slow queries can run at
once, so you queue up slow queries if there are lots of them executing.
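The two ideas above could be sketched roughly as follows. This is untested against a real Lucene workload and rests on several assumptions: QueryThrottle and its parameters are hypothetical names, the search is passed in as a plain Callable (in practice it would wrap the IndexSearcher.search(Query) call), and the caller flags a query as "expected slow" up front (say, because it contains wildcard or fuzzy terms) rather than detecting it mid-flight. Slow queries go to a small fixed pool, which caps how many run at once and queues the rest; a watchdog lowers the priority of any query still running after the threshold. The lambdas need Java 8 or later.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper; not part of Lucene. */
final class QueryThrottle {
    private final ExecutorService fastPool;
    private final ExecutorService slowPool; // size == max concurrent slow queries
    private final ScheduledExecutorService watchdog =
            Executors.newSingleThreadScheduledExecutor();
    private final long demoteAfterMillis;

    QueryThrottle(int fastThreads, int maxSlow, long demoteAfterMillis) {
        this.fastPool = Executors.newFixedThreadPool(fastThreads);
        this.slowPool = Executors.newFixedThreadPool(maxSlow);
        this.demoteAfterMillis = demoteAfterMillis;
    }

    /** Runs the search on the matching pool; excess slow queries wait in its queue. */
    <T> Future<T> submit(final Callable<T> search, boolean expectedSlow) {
        ExecutorService pool = expectedSlow ? slowPool : fastPool;
        Callable<T> wrapped = () -> {
            final Thread worker = Thread.currentThread();
            final Object lock = new Object();
            final boolean[] finished = {false};
            // Lower this worker's priority if the search is still running
            // after the threshold. The lock keeps a late demotion from
            // landing after the priority has been restored.
            Runnable demote = () -> {
                synchronized (lock) {
                    if (!finished[0]) {
                        worker.setPriority(Thread.MIN_PRIORITY);
                    }
                }
            };
            ScheduledFuture<?> pending =
                    watchdog.schedule(demote, demoteAfterMillis, TimeUnit.MILLISECONDS);
            try {
                return search.call();
            } finally {
                pending.cancel(false);
                synchronized (lock) {
                    finished[0] = true;
                    worker.setPriority(Thread.NORM_PRIORITY); // pooled thread is reused
                }
            }
        };
        return pool.submit(wrapped);
    }

    void shutdown() {
        fastPool.shutdown();
        slowPool.shutdown();
        watchdog.shutdownNow();
    }
}
```

Two caveats. Thread priorities are only a hint to the scheduler, and on some platforms the JVM largely ignores them, so the demotion may have little effect; the cap on concurrent slow queries is the more dependable half. And like the suggestion itself, this has not been measured, so treat it as a starting point for experiments.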
>
> Thanks
> Daniel
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>