lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mark Miller <>
Subject Re: Optimize and Out Of Memory Errors
Date Tue, 23 Dec 2008 17:25:30 GMT
Mark Miller wrote:
> Lebiram wrote:
>> Also, what are norms 
> Norms are a byte value per field stored in the index that is factored 
> into the score. Its used for length normalization (shorter documents = 
> more important) and index time boosting. If you want either of those, 
> you need norms. When norms are loaded up into an IndexReader, its 
> loaded into a byte[maxdoc] array for each field - so even if one 
> document out of 400 million has a field, its still going to load 
> byte[maxdoc] for that field (so a lot of wasted RAM).  Did you say you 
> had 400 million docs and 7 fields? Google says that would be:
>    **400 million x 7 byte = 2 670.28809 megabytes**
> On top of your other RAM usage.
Just to avoid confusion, that should really read a byte per document per 
field. If I remember right, it gives 255 boost possibilities, limited to 
25 with length normalization.

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message