lucene-java-user mailing list archives

From Mark Miller <>
Subject Re: FieldCache Question
Date Wed, 04 Feb 2009 17:41:49 GMT
Todd Benge wrote:
> The intent is to reduce the amount of memory that is held in cache.  As it
> is now, it looks like there is an array of comparators for each index
> reader.  Most of the data in the array appears to be the same for each cache
> so there is duplication for each type ( string, float).
Use an array cachekey and override it as not mergeable.

I suppose in terms of the unique terms array, you could see some

I don't think there should be much duplication though - in the non
String cases, each SegmentIndexReader will only hold the values for
itself, so the sub arrays together would be the same size as the full array.
In the String case, you will have duplicates across the unique terms arrays,
so if you have a lot of unique terms, that may cause issues, but the ordinal
arrays will not be any larger. And the unique terms arrays shouldn't be
terrible - the number of terms per segment should drop logarithmically. I'm
not sure you'll see much of a difference, and it would only be with String sorts.
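
To make the String case concrete, here is a minimal sketch in plain Java. It is not Lucene code: StringIndexSketch is a hypothetical stand-in for a per-reader String sort cache (a sorted unique terms array plus one ordinal per document), and the segment contents are made up for illustration. It builds one cache per segment and one over the whole index, then compares the totals.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical, simplified stand-in for a per-reader String sort cache.
// It only mimics the two arrays discussed above; it is not Lucene's class.
final class StringIndexSketch {
    final String[] lookup; // sorted unique terms seen by this reader
    final int[] order;     // order[doc] = position of that doc's term in lookup

    StringIndexSketch(String[] docValues) {
        // TreeSet de-duplicates and sorts the terms in natural order.
        lookup = new TreeSet<>(Arrays.asList(docValues)).toArray(new String[0]);
        order = new int[docValues.length];
        for (int doc = 0; doc < docValues.length; doc++) {
            order[doc] = Arrays.binarySearch(lookup, docValues[doc]);
        }
    }

    // One cache per segment, as with per-segment readers.
    static StringIndexSketch[] buildPerSegment(String[][] segments) {
        StringIndexSketch[] caches = new StringIndexSketch[segments.length];
        for (int i = 0; i < segments.length; i++) {
            caches[i] = new StringIndexSketch(segments[i]);
        }
        return caches;
    }

    // All documents of all segments in one array, as a top-level reader sees them.
    static String[] concat(String[][] segments) {
        return Arrays.stream(segments).flatMap(Arrays::stream).toArray(String[]::new);
    }

    static int totalOrds(StringIndexSketch[] caches) {
        int n = 0;
        for (StringIndexSketch c : caches) n += c.order.length;
        return n;
    }

    static int totalTerms(StringIndexSketch[] caches) {
        int n = 0;
        for (StringIndexSketch c : caches) n += c.lookup.length;
        return n;
    }

    public static void main(String[] args) {
        // Made-up data: one larger segment and two smaller ones.
        String[][] segments = {
            {"a", "b", "c", "a", "d", "b"},
            {"b", "c", "e"},
            {"a", "e"}
        };
        StringIndexSketch[] perSegment = buildPerSegment(segments);
        StringIndexSketch whole = new StringIndexSketch(concat(segments));

        System.out.println("per-segment: ordinals=" + totalOrds(perSegment)
                + ", unique terms=" + totalTerms(perSegment));
        System.out.println("whole index: ordinals=" + whole.order.length
                + ", unique terms=" + whole.lookup.length);
    }
}
```

The ordinal totals match either way, since there is one entry per document. Only the unique terms total grows, by the terms that appear in more than one segment.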

That is, unless you are creating your own separate FieldCaches on
MultiSegmentReaders - then you would double everything.

> Yes - we're running about 80G in the indices so there's not enough RAM for
> all the data in the fieldcache.
That is a large index. Can you share how many documents?
