lucene-java-user mailing list archives

From Mark Miller
Subject Re: LUCENE-831 (complete cache overhaul) -> mem use
Date Fri, 14 Nov 2008 20:26:41 GMT
It's hard to predict the future of LUCENE-831. I would bet that it will 
end up in Lucene at some point in one form or another, but it's hard to 
say whether that form will be what's in the available patches. (I'm a 
contrib committer, so I won't have any real say in that; take the 
prediction with a grain of salt.) It has strong ties to other issues, and 
a committer hasn't really had their whack at it yet.

Having said that, LUCENE-831 allows two ways of dealing with field 
values: either the old-style int/string/long/etc. arrays or, for a small 
speed hit and faster reopens, an ArrayObject type that is basically an 
Object that can provide access to one or two real or virtual arrays. So 
technically you could use an ArrayObject that had a sparse 
implementation behind it. Unfortunately, you would have to implement new 
CacheKeys to do this. That is trivial to do, but it reveals our 
LUCENE-831 problem: the number of CacheKeys grows exponentially with 
every new little option/idea, along with the juggling of which one to 
use. I haven't thought about it, but I'm hoping an API tweak can 
alleviate some of this.
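
To make the "sparse implementation behind an ArrayObject" idea concrete, 
here is a minimal sketch. It is not code from the LUCENE-831 patches: the 
names IntArrayView, DenseIntArrayView and SparseIntArrayView are 
hypothetical, and the real ArrayObject and CacheKey types may look quite 
different. It only shows how a dense array and a sparse "virtual" array 
could sit behind the same accessor.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the ArrayObject idea; NOT the LUCENE-831 API.
interface IntArrayView {
    int get(int docId); // value for a document, or a default if the field is unset
    int size();         // number of addressable documents (think maxDoc)
}

// Old style: one slot per document, whether or not the field is set.
final class DenseIntArrayView implements IntArrayView {
    private final int[] values;

    DenseIntArrayView(int[] values) {
        this.values = values;
    }

    public int get(int docId) {
        return values[docId];
    }

    public int size() {
        return values.length;
    }
}

// Sparse "virtual array": only documents that have a value take up space.
final class SparseIntArrayView implements IntArrayView {
    private final Map<Integer, Integer> values = new HashMap<>();
    private final int maxDoc;
    private final int missingValue; // assumed default for documents without the field

    SparseIntArrayView(int maxDoc, int missingValue) {
        this.maxDoc = maxDoc;
        this.missingValue = missingValue;
    }

    void put(int docId, int value) {
        checkRange(docId);
        values.put(docId, value);
    }

    public int get(int docId) {
        checkRange(docId);
        Integer v = values.get(docId);
        return v == null ? missingValue : v;
    }

    public int size() {
        return maxDoc;
    }

    // Number of entries actually held in memory.
    int storedEntries() {
        return values.size();
    }

    private void checkRange(int docId) {
        if (docId < 0 || docId >= maxDoc) {
            throw new IndexOutOfBoundsException("docId out of range: " + docId);
        }
    }
}
```

A boxed HashMap carries a sizeable per-entry overhead, so a layout like 
this would only save memory when the field is set on a small fraction of 
documents; a primitive-keyed map or a sorted pair of int arrays would be 
the obvious refinement.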

- Mark

Britske wrote:
> Hi, 
> I recently saw activity on LUCENE-831 (Complete overhaul of FieldCache
> API/Implementation) which I have interest in. 
> I posted previously on this with my concern that, given the current default
> cache, I sometimes get OOM errors because I have a lot of fields that are
> sorted on, which ultimately causes the FieldCache to grow larger than
> available RAM. 
> Ultimately I want to subclass the new pluggable FieldCache of LUCENE-831 to
> offload to disk (using ehcache or memcachedB or something) but haven't found
> the time yet. 
> What I would like to know for now is whether the newly implemented
> standard cache in LUCENE-831 uses a different caching strategy than the
> standard FieldCache in Lucene. 
> I.e., the normal cache consumes memory for every document in Lucene when
> generating a FieldCache entry, even if the document hasn't got that field set. 
> Since my documents are very sparse in the fields I want to sort on, it
> would make a lot of difference if documents that don't have the field in
> question set didn't add to the memory used. 
> So am I lucky? Or would I indeed have to cook up something myself? 
> Thanks and best regards,
> Geert-Jan
