Hi Lucene experts,
I'm having a problem with Sort performance during searches. I'm using
Lucene 1.9.1.
I need to sort by a date field in the document. When I use the
default Sort.RELEVANCE, query response time is ~6ms. However, when I
specify a sort, e.g. Searcher.search( query, new Sort( "mydatefield" )
), the query response time increases by a factor of 10 to 20.
CPU usage also shoots up to nearly 90%. Is this expected behavior?
I thought the default sort and sort by field should perform roughly
the same when the values are cached in memory, since they both have to
do a top-K ranking over the same number of raw hits. The performance
gets disproportionately worse as I increase the number of parallel
threads that query the same Searcher object.
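To make that expectation concrete, here is a minimal sketch of what I
have in mind. It is not Lucene code: the class and method names are
made up, and the arrays stand in for per-document scores and for field
values that are assumed to be already cached in memory. Both orderings
go through the same bounded-heap top-K pass over the same hits and
differ only in the comparator.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names exist in Lucene.
public class TopKSketch {

    // Keep the k best doc ids under cmp (smaller compares as "better").
    static List<Integer> topK(int[] hits, int k, Comparator<Integer> cmp) {
        // Heap is ordered worst-first, so the root is the next entry to evict.
        PriorityQueue<Integer> heap = new PriorityQueue<>(k, cmp.reversed());
        for (int doc : hits) {
            if (heap.size() < k) {
                heap.add(doc);
            } else if (cmp.compare(doc, heap.peek()) < 0) {
                heap.poll();
                heap.add(doc);
            }
        }
        List<Integer> out = new ArrayList<>(heap);
        out.sort(cmp); // best first
        return out;
    }

    // Relevance order: highest score first.
    public static List<Integer> topKByScore(int[] hits, float[] scores, int k) {
        return topK(hits, k, (a, b) -> Float.compare(scores[b], scores[a]));
    }

    // Field order: ascending by a value looked up in an in-memory array.
    public static List<Integer> topKByField(int[] hits, long[] cached, int k) {
        return topK(hits, k, (a, b) -> Long.compare(cached[a], cached[b]));
    }

    public static void main(String[] args) {
        int[] hits = {0, 1, 2, 3, 4};
        float[] scores = {0.1f, 0.9f, 0.5f, 0.7f, 0.3f};
        long[] dates = {20060103L, 20060105L, 20060101L, 20060104L, 20060102L};
        System.out.println(topKByScore(hits, scores, 2)); // [1, 3]
        System.out.println(topKByField(hits, dates, 2));  // [2, 4]
    }
}
```

If sorting by field really works like this once the values are in
memory, I would expect the per-hit cost to be an array lookup plus a
comparison, which is why the 10x to 20x slowdown surprises me.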
Also, in my previous experience with sorting by a field in Lucene, I
seem to remember there being a preload time when you first search with
a sort by field, sometimes taking 30 seconds or so to load all of the
field's values into the in-memory cache associated with the Searcher
object. This initial preload time doesn't seem to be happening in my
case -- does that mean that for some reason Lucene is not caching the
field values?
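For reference, this is the caching behavior I am assuming, written as
a small standalone sketch. Again, this is not Lucene's implementation:
ValueSource, FieldCacheSketch and the loads counter are hypothetical
names I invented for illustration. The idea is that the first sorted
search against a given reader pays the full cost of loading the
field's values, and every later search reuses the same array.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical illustration only; none of these names exist in Lucene.
public class FieldCacheSketch {

    // Stand-in for an index reader that can enumerate one field's values.
    public interface ValueSource {
        long[] loadAll(String field);
    }

    // Keyed weakly by reader so entries vanish when the reader is discarded.
    private final Map<ValueSource, Map<String, long[]>> cache = new WeakHashMap<>();

    // Number of expensive loads performed so far (for demonstration).
    public int loads = 0;

    // Return cached values, loading them on the first request only.
    public synchronized long[] get(ValueSource reader, String field) {
        Map<String, long[]> perReader =
                cache.computeIfAbsent(reader, r -> new HashMap<>());
        long[] values = perReader.get(field);
        if (values == null) {
            values = reader.loadAll(field); // the slow "preload" step
            perReader.put(field, values);
            loads++;
        }
        return values;
    }
}
```

Under this model, seeing neither a slow first query nor fast later
ones would suggest that the values are being reloaded, or that a new
cache key (for example a new reader or searcher) is used on each
search.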
I have an index of 1 million documents, taking up about 1.7 GB of
disk space. I specify -Xmx2000m when running my Java search
application.
Any advice or insight would be much appreciated.
Thanks,
~Heng