lucene-java-user mailing list archives

From "Ganesh" <emailg...@yahoo.co.in>
Subject Re: Alternative way to simulate sorting without doing actual sort
Date Wed, 22 Jul 2009 09:31:21 GMT
Hello Eric,

Thanks for your reply.

Memory required for sorting: 4 * reader.maxDoc() bytes.

I am sorting on a datetime field with minute resolution. If 100 records represent a minute,
then a 1 million record database will have around 20000 unique terms. The amount of memory
consumed would be 4 * 1000000 + 20000 * 8 bytes [treating each datetime as a long].

Most of the memory is consumed by the 4 * reader.maxDoc() part. If I have two or three fields,
say (YYYYMMDD, hh, mm), then the memory consumption would be even higher, because that
per-document cost would be paid once per field. How can splitting the field help to reduce
the memory usage?
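
To make the arithmetic concrete, here is a rough Java sketch of the estimate above. It only
encodes the model used in this mail (4 bytes per document plus 8 bytes per unique term, per
sort field). The class name, the method name, and the unique-term counts for the split fields
(14 dates, since 20000 minutes is about 14 days, and at most 1440 HHMM values) are
illustrative assumptions, not measurements and not Lucene APIs.

```java
public class SortMemoryEstimate {

    // Estimated bytes for ONE sort field under the model in this mail:
    // a 4-byte entry per document plus bytesPerTerm for each unique term.
    public static long estimateBytes(long maxDoc, long uniqueTerms, long bytesPerTerm) {
        return 4L * maxDoc + uniqueTerms * bytesPerTerm;
    }

    public static void main(String[] args) {
        long maxDoc = 1_000_000L;   // 1 million records, as in the mail
        long bytesPerTerm = 8L;     // assumption: each term is held as a long

        // Single combined datetime field with about 20000 unique minute values.
        long single = estimateBytes(maxDoc, 20_000L, bytesPerTerm);

        // Split layout: one YYYYMMDD field and one HHMM field.
        // Unique-term counts below are illustrative assumptions.
        long dateField = estimateBytes(maxDoc, 14L, bytesPerTerm);
        long timeField = estimateBytes(maxDoc, 1_440L, bytesPerTerm);
        long split = dateField + timeField;

        System.out.println("single field bytes: " + single);
        System.out.println("split fields bytes: " + split);
    }
}
```

Under this simple model the split layout costs more, because the per-document part is paid
once for each field while the saving on unique terms is small. Whether the real sort cache
behaves like this model is exactly what I would like to have confirmed.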

Please correct me if I am wrong. I need some justification for splitting the date into multiple
terms.

Regards
Ganesh
 

----- Original Message ----- 
From: "Erick Erickson" <erickerickson@gmail.com>
To: <java-user@lucene.apache.org>
Sent: Tuesday, July 21, 2009 7:29 PM
Subject: Re: Alternative way to simulate sorting without doing actual sort


> Have you tried splitting your times into separate fields, perhaps one with
> YYYYMMDD and another with HHMM, then doing a primary sort on YYYYMMDD and a
> secondary sort on HHMM? That will greatly reduce your total unique values and
> should improve your memory consumption.
> Best
> Erick
> 
> On Tue, Jul 21, 2009 at 4:27 AM, Ganesh <emailgane@yahoo.co.in> wrote:
> 
>> Hello all
>>
>> I am sorting on datetime with minute resolution. It easily reaches the
>> maximum heap size. I have almost 100M records and it is using 1.5 GB. I
>> am now in a situation where I have to stop sorting and find some alternative
>> way.
>>
>> I tried adding a document boost and a field boost for the datetime. Document boost
>> alone does not work. Document boost and field boost have an impact on the score.
>> A search on datetime gives me results sorted by datetime, but a search on any
>> other field does not work.
>>
>> I am doing updates, and they change the doc ID. I want to get the results
>> sorted by the order in which documents were first inserted. Updates should not
>> disturb the result set. I think Solr has some facility to get the list of
>> recently added documents.
>>
>> Any ideas are greatly appreciated.
>>
>> Regards
>> Ganesh
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>