lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Heng Mei" <heng....@gmail.com>
Subject search performance degrades by order of magnitude when using SortField.
Date Mon, 29 May 2006 04:57:45 GMT
Hi Lucene experts,

I'm having a problem with Sort performance during searches.  I'm using
Lucene 1.9.1.

I need to Sort by a date field in the document.  When I use the
default Sort.RELEVANCE, query response time is ~6ms.  However, when I
specify a sort, e.g. Searcher.search( query, new Sort( "mydatefield" )
 ), the query response time gets multiplied by a factor of 10 or 20.
Also, CPU usage shoots up to nearly 90%.   Is this expected behavior?
I thought the default sort and sort by field should perform roughly
the same when the values are cached in memory, since they both have to
do a top-K ranking over the same number of raw hits.   The performance
gets disproportionately worse as I increase the number of parallel
threads that query the same Searcher object.

Also, in my previous experience with sorting by a field in Lucene, I
seem to remember there being a preload time when you first search with
a sort by field, sometimes taking 30 seconds or so to load all of the
field's values into the in-memory cache associated with the Searcher
object.  This initial preload time doesn't seem to be happening in my
case -- does that mean that for some reason Lucene is not caching the
field values?

I have an index of 1 million documents, taking up about 1.7G of
diskspace.  I specify -Xmx2000m when running my java search
application.

Any advice or insight would be much appreciated.

Thanks,
~Heng

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message