lucene-java-user mailing list archives

From Michael McCandless <>
Subject Re: Lucene 4 single segment performance improvement tips?
Date Wed, 05 Mar 2014 12:25:21 GMT
What sorts of queries are you running?  It seems like they must be
very terms-dict intensive, e.g. primary key lookups or multi-term
queries, and maybe not matching too many documents?

It's strange that you can't get CPU usage up as you add threads.  Maybe
simplify the test to remove Jetty?  I.e., a standalone test that just
invokes Lucene APIs directly from multiple threads.
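
A minimal sketch of what such a standalone test could look like, using only the JDK. The class name `ThroughputHarness`, the `run` method, and the placeholder task are hypothetical and not part of Lucene; in a real test the task would be replaced by a call on a shared IndexSearcher, for example something like `searcher.search(query, filter, 100)` in Lucene 4.x. The idea is to take Jetty and the request path out of the picture and see whether throughput scales with the number of threads.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical harness: runs one task repeatedly from N threads and counts completed calls.
final class ThroughputHarness {

    /** Runs `task` in a loop on `threads` threads for roughly `millis` ms; returns total completed calls. */
    static long run(int threads, long millis, Runnable task) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        AtomicLong completed = new AtomicLong();
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    start.await(); // release all workers at the same moment
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(millis);
                long local = 0; // per-thread counter, so the harness adds no shared-state contention
                while (System.nanoTime() - deadline < 0) {
                    task.run();
                    local++;
                }
                completed.addAndGet(local);
            });
        }
        start.countDown();
        pool.shutdown();
        pool.awaitTermination(millis + 10_000, TimeUnit.MILLISECONDS);
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder CPU-bound work (an assumption); swap in the real search call here.
        Runnable placeholder = () -> {
            long h = 17;
            for (int i = 0; i < 10_000; i++) h = h * 31 + i;
            if (h == 42) System.out.println(h); // keeps the loop from being optimized away
        };
        long millis = 1000;
        for (int threads : new int[] {1, 2, 4, 8}) {
            long calls = run(threads, millis, placeholder);
            System.out.println(threads + " threads: " + (calls * 1000 / millis) + " calls/sec");
        }
    }
}
```

If calls/sec stops growing well before the thread count reaches the number of cores, the bottleneck is likely inside the search path itself rather than in Jetty or the thread pool.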

Does the profiler reveal any hot locks, where threads are having to
wait to acquire the lock?
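
If the profiler does not show lock contention directly, the JDK's own thread management bean offers a rough check. The sketch below is an assumption about how one might do this, not something from the original thread; `ContentionReport`, `enable`, and `report` are hypothetical names. Note that the blocked counters only cover contention on `synchronized` monitors; threads parked on java.util.concurrent locks show up under the waited counters instead.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: prints per-thread blocked/waited statistics from the JVM.
final class ContentionReport {

    /** Turns on contention timing if the JVM supports it; call before the workload starts. */
    static void enable() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (mx.isThreadContentionMonitoringSupported()) {
            mx.setThreadContentionMonitoringEnabled(true);
        }
    }

    /** Returns one line per live thread; call after (or during) the workload. */
    static String report() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            if (info == null) continue; // thread exited between the two calls
            sb.append(String.format(
                "%-30s blockedCount=%d blockedMs=%d waitedCount=%d lock=%s%n",
                info.getThreadName(),
                info.getBlockedCount(),
                info.getBlockedTime(),   // -1 when contention monitoring is disabled
                info.getWaitedCount(),
                info.getLockName()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        enable();
        // ... run the multi-threaded search workload here ...
        System.out.print(report());
    }
}
```

Search threads with large blockedCount or blockedMs values that all name the same lock would point to a hot lock.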

Mike McCandless

On Wed, Mar 5, 2014 at 4:18 AM, Arvind Kalyan <> wrote:
> Hi folks,
> We are currently using Lucene 4.5 and are hitting some bottlenecks; we
> would appreciate some input from the community.
> This particular index (the disk size for which is about 10GB) is guaranteed
> to not have any updates, so we made it a single segment index by doing a
> forceMerge(1). The index is guaranteed to be in-memory as well: we use the
> MMapDirectory and the whole thing is mlocked after load. So there is no
> disk I/O.
> Our runtime/search use-case is very simple: run filters to select all docs
> that match some conditions specified in a filter query (we do not use
> Lucene scoring) and return the first 100 docs that match (this is an
> over-simplification)
> On a machine with nothing else running, we are unable to move the needle on
> CPU utilization to serve higher QPS. We see that most of the time is spent
> in BlockTreeTermsReader.FieldReader.iterator() when we run profiling tools
> to see where time is being spent. The CPU usage doesn't cross 30% (we have
> multiple threads, one per client connected over a Jetty connection, all
> taken from a bounded thread-pool). We tried the usual suspects like
> tweaking size of the threadpool, changing some jvm parameters like newsize,
> heapsize, using cms for old gen, parnew for newgen, etc.
> Does anyone here have any pointers or general suggestions on how we can get good
> performance out of Lucene 4.x? Specifically IndexSearcher performance
> improvements for large, single-segment, atomicreaders.
> I'll share more specifics if necessary but I'd like to hear from folks here
> what your experience has been and what you did to speed up your
> IndexSearchers to improve throughput *and/or* latency.
> Thanks!
> --
> Arvind Kalyan
