lucene-java-user mailing list archives

From Arvind Kalyan
Subject Re: Lucene 4 single segment performance improvement tips?
Date Wed, 05 Mar 2014 21:53:21 GMT
On Wed, Mar 5, 2014 at 8:14 AM, Chris Hostetter wrote:

> : Our runtime/search use-case is very simple: run filters to select all docs
> : that match some conditions specified in a filter query (we do not use
> : Lucene scoring) and return the first 100 docs that match (this is an
> : over-simplification)
>
> "first" as defined how? in order collected by a custom collector, or via
> some sort?

We sort the docs in some order and freeze the single segment index.

> : On a machine with nothing else running, we are unable to move the needle on
> : CPU utilization to serve higher QPS. We see that most of the time is spent
> : in BlockTreeTermsReader.FieldReader.iterator() when we run profiling tools
> : to see where time is being spent. The CPU usage doesn't cross 30% (we have
> : multiple threads one per each client connected over a Jetty connection all
> : taken from a bounded thread-pool). We tried the usual suspects like
> : tweaking size of the threadpool, changing some jvm parameters like newsize,
> : heapsize, using cms for old gen, parnew for newgen, etc.
>
> You said you have one thread per client, but you didn't mention anything
> about varying the number of clients -- did you try increasing the number
> of clients hitting your application concurrently?  It's possible that your
> box is "beefy" enough that 30% of the available CPU is all that's needed
> for the number of active concurrent threads you are using (increasing the
> size of the threadpool isn't going to affect anything if there aren't more
> clients utilizing those threads)

Yes, the number of threads is bounded (we varied this to see how things
change), and we increased the qps from the client side. The client requests
essentially pile up and throughput does not go beyond 300 qps. The fact that
we are unable to go beyond that qps while still not utilizing more than 30%
CPU is what's concerning. No monitors/locks come up during profiling,
either; only ReferenceQueue.poll() comes up. There's still enough memory
available in the heap (6 GB heap allocated, 2 GB newgen, 1:8 survivor ratio,
70% CMS threshold, ParNew GC) and the machine has 64 GB RAM.
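
One way to cross-check that worker threads are not parked on locks is to
sample thread states from inside the JVM while the load test runs. What
follows is only a rough sketch using the standard java.lang.management API;
the class name ThreadStateHistogram is made up for illustration and is not
part of our setup. If most pool threads show up as WAITING or TIMED_WAITING
rather than RUNNABLE, they are likely idle and waiting for work rather than
busy inside Lucene.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper (not from the original message): counts live JVM
// threads by state at the moment sample() is called.
class ThreadStateHistogram {

    static Map<Thread.State, Integer> sample() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // getThreadInfo returns a null entry for any thread that has
        // terminated between the two calls, so skip those.
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            if (info == null) {
                continue;
            }
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints one snapshot; under load this would be called periodically.
        System.out.println(sample());
    }
}
```

A single snapshot is noisy, so in practice one would call sample() every
second or so during the run and look at the trend.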

I'm going to repeat the experiment with just Lucene (and no Jetty), as
Mike suggested, but meanwhile if any of you have any other pointers it'd be
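
A Lucene-only experiment of this kind could be driven by a small closed-loop
harness along the following lines. This is a minimal sketch under stated
assumptions, not the actual test code: QpsHarness and measure are
hypothetical names, and the placeholder Runnable stands in for the real
filter query against the frozen single-segment index. It takes Jetty and the
network out of the picture and reports throughput as the thread count grows.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical harness (names are illustrative, not from the original message).
class QpsHarness {

    /**
     * Runs query in a closed loop on the given number of threads for roughly
     * millis milliseconds and returns the observed queries per second.
     */
    static double measure(final Runnable query, int threads, long millis) {
        final AtomicLong completed = new AtomicLong();
        final long startNanos = System.nanoTime();
        final long deadline = startNanos + TimeUnit.MILLISECONDS.toNanos(millis);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                // Each worker issues the next query as soon as the last returns.
                while (System.nanoTime() - deadline < 0) {
                    query.run();
                    completed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            // Generous upper bound so a slow final query cannot hang the run.
            pool.awaitTermination(millis + 60_000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        double elapsedSeconds = (System.nanoTime() - startNanos) / 1e9;
        return completed.get() / elapsedSeconds;
    }

    public static void main(String[] args) {
        // Placeholder CPU-bound workload; replace with the real search call.
        Runnable placeholder = () -> {
            long x = 0;
            for (int i = 0; i < 10_000; i++) {
                x += i;
            }
            if (x < 0) {
                throw new IllegalStateException("unreachable");
            }
        };
        int cores = Runtime.getRuntime().availableProcessors();
        for (int t = 1; t <= cores; t *= 2) {
            System.out.printf("threads=%d qps=%.0f%n", t, measure(placeholder, t, 2000));
        }
    }
}
```

If qps stops scaling with the thread count here as well, the bottleneck is
probably in the search path itself; if it scales cleanly, the container or
the client side becomes the more likely suspect.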


Arvind Kalyan
