lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Smith <psm...@aconex.com>
Subject Re: Multisearcher will maintain index order sorting?
Date Thu, 23 Oct 2008 05:27:06 GMT

On 23/10/2008, at 4:20 PM, Ganesh wrote:

> My Index DB is having 10 million records and it will grow to 30  
> million. Currently I am using millisecond timestamp and the RAM  
> cosumption is more. I will change the resolution to minute. I am  
> using 2 searcher objects refreshing each other every minute. When i  
> do a warmup query with sort of timestamp then the cpu is spiked to  
> 100% and this is happening for every minute.  In order to avoid  
> these issues, i am planning to break my DB and to do sort on indexed  
> order.
>
> Will multisearcher will maintain indexed order on sorting?


If you need to keep the millisecond accuracy, break down the timestamp  
into 3 fields: day, time, millisecond, and sort on 3 fields.  This way  
each field has a much smaller number of distinct values and well  
occupy vastly less memory over time.  I don't think there's much  
overhead in this approach either, because in most cases, the top-level  
field (day) will provide most of the sorting ability, and Lucene will  
only need to hit the time & millisecond fields less frequently for  
comparison.

I believe Multisearcher does a merge sort of the 2 (or more) sub- 
searchers, so there is an overhead in using in versus a single searcher.

Paul

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message