lucene-java-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: Optimize and Out Of Memory Errors
Date Wed, 24 Dec 2008 14:20:23 GMT
We don't know that those norms are "the" problem. Luke is loading norms if 
it's searching that index. But what else is Luke doing? What else is your 
app doing? I suspect your app requires more RAM than Luke? How much RAM 
do you have, and how much are you allocating to the JVM?

The norms are not necessarily the problem you have to solve - but it 
would appear they are taking up over 2 gig of memory. Unless you have 
some to spare (and it sounds like you may not), it could be a good idea 
to turn them off for particular fields.
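
As a rough sanity check on that figure, here is a minimal sketch of the arithmetic, using the numbers from the quoted message below (400 million documents, 7 fields with norms) and one byte per document per field. The class and method names are made up for illustration and are not part of Lucene; real heap usage will be higher, since this counts only the norms arrays.

```java
// Hypothetical helper, not a Lucene API: estimates the heap taken by norms
// when each field with norms is loaded as a byte[maxDoc] array.
public class NormsMemoryEstimate {

    // One byte per document per field that has norms enabled.
    static long normsBytes(long maxDoc, int fieldsWithNorms) {
        return maxDoc * fieldsWithNorms;
    }

    // Converts bytes to megabytes using 1 MB = 1024 * 1024 bytes,
    // which matches the Google calculator figure quoted in this thread.
    static double toMegabytes(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long maxDoc = 400_000_000L; // document count taken from the thread
        int fields = 7;             // fields with norms, taken from the thread

        long bytes = normsBytes(maxDoc, fields);
        System.out.println("norms bytes:     " + bytes);
        System.out.println("norms megabytes: " + toMegabytes(bytes));

        // Effect of omitting norms on, say, 5 of the 7 fields (illustrative choice).
        long reduced = normsBytes(maxDoc, fields - 5);
        System.out.println("with 2 fields:   " + toMegabytes(reduced) + " MB");
    }
}
```

Each field that has norms switched off drops one byte[maxdoc] array from the total, so with 400 million documents that is roughly 381 MB saved per field. As far as I recall, in Lucene of this vintage that is done per field at index time (for example with Field.setOmitNorms(true)), and the index has to be rebuilt for it to take effect.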

- Mark

Lebiram wrote:
> Is there a way to not factor in norms data in scoring somehow?
>
> I'm just stumped as to how Luke is able to do a search (with limit) on the docs, but in
> my code it just dies with OutOfMemory errors.
> How does Luke not allocate these norms?
>
>
>
>
> ________________________________
> From: Mark Miller <markrmiller@gmail.com>
> To: java-user@lucene.apache.org
> Sent: Tuesday, December 23, 2008 5:25:30 PM
> Subject: Re: Optimize and Out Of Memory Errors
>
> Mark Miller wrote:
>   
>> Lebiram wrote:
>>     
>>> Also, what are norms 
>>>       
>> Norms are a byte value per field stored in the index that is factored into the score.
>> They are used for length normalization (shorter documents = more important) and index-time boosting.
>> If you want either of those, you need norms. When norms are loaded into an IndexReader,
>> they are loaded into a byte[maxdoc] array for each field - so even if only one document out of 400 million
>> has a field, it's still going to load byte[maxdoc] for that field (so a lot of wasted RAM).
>> Did you say you had 400 million docs and 7 fields? Google says that would be:
>>
>>
>>    **400 million x 7 byte = 2 670.28809 megabytes**
>>
>> On top of your other RAM usage.
>>     
> Just to avoid confusion, that should really read a byte per document per field. If I
> remember right, it gives 255 boost possibilities, limited to 25 with length normalization.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org