lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Koji Sekiguchi <>
Subject Re: FieldCache memory estimation - term values are interned?
Date Sun, 02 May 2010 00:23:21 GMT
Yonik Seeley wrote:
> 2010/4/30 Koji Sekiguchi <>:
>> Are Strings that are got via FieldCache.DEFAULT.getStrings( reader,
>> field ) interned?
>> Since I have a requirement for having FieldCaches of some
>> fields in 250M docs index, I'd like to estimate memory
>> consumed by FieldCache.
>> By looking at FieldCacheImpl source code, it seems that
>> field names are interned, but values are not?
> Values are not interned, but in a single field cache entry (String[])
> the same String object is used for all docs with that same value.
Yeah, you are right. Because I could see the arbitrary two Strings that have
same value were identical (== is true) in String[] gotten by 
getStrings() but
I coudn't see explicit intern() for them, I asked. Can you elaborate who 
done it?

> But... I think StringIndex is more commonly used in both Lucene and
> Solr than String[] (sorting, faceting, etc) so double check that it's
> not StringIndex you should be looking at.
Yes, I think StringIndex is better than String[] for my requirement in terms
of memory consumption as well (an extreme case, male/female field etc.).

Thank you,



To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message