lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Elschot <paul.elsc...@xs4all.nl>
Subject Re: Out of memory - CachingWrappperFilter and multiple threads
Date Mon, 18 Feb 2008 16:38:10 GMT
Op Monday 18 February 2008 17:08:50 schreef mark harwood:
> >>Even though you only have two threads... how many different  IndexReader instances
do you have?
> 
> One reader - kept open throughout.
> 
> >>I think the only way to make the current version
> >>work correctly is to make whole method synchronized on this
> >>object, i.e. declare it as public synchronized, and then
> >>the synchronized(cache) occurrences can be removed.
> 
> Analysing the pros and cons of such a  change...
> 1) Synchronisation costs for  pre-cached content on the same reader (hopefully the most
common case) would be just as quick as it is now i.e. swapping a synchronised(this) for synchronised(cache).
> 2) Synchronisation for uncached content would avoid reading multiple copies of the same
bitset (one would expect overall faster responses in this scenario) 
> 3) However, there would be an additional synchronisation cost for threads using unrelated
readers - a thread using an old reader with pre-cached content would be delayed (potentially
significantly) by a thread caching new content on a new reader.
> 
> This last one sounds a bit nasty?

Yes, that could be problematic. 

But I'm afraid more detailed information will be needed before
this OOM can be solved.

Regards,
Paul Elschot

> 
> Cheers
> Mark
> 
> ----- Original Message ----
> From: Paul Elschot <paul.elschot@xs4all.nl>
> To: java-dev@lucene.apache.org
> Sent: Monday, 18 February, 2008 3:23:01 PM
> Subject: Re: Out of memory - CachingWrappperFilter and multiple threads
> 
> I think this code has been discussed before, but I can't find
> the occasion on which that happened.
> 
> Actually, this code may do some things like creating two different
> caches, and as you said calling filter.bits(reader) in parallel.
> 
> Creating different caches does not hurt because in the end
> one only one will survive the assignment,  but it's not nice as
> different threads might synchronize on different caches
> for the first cache.get() call, and I don't know whether
> java guarantees that the cache.get() call will synchronize
> on the same cache as the cache in the cache.get()
> call in this case.
> 
> Doing the filter.bits(reader) call without synchronization
> is probably on purpose: it's better not to hold any visible
> lock while calling outside an object.
> 
> As it stands, I think the only way to make the current version
> work correctly is to make whole method synchronized on this
> object, i.e. declare it as public synchronized, and then
> the synchronized(cache) occurrences can be removed.
> 
> It might be better to initialize the cache in the constructor,
> and then synchronize on the cache while even while
> calling filter.bits(reader). This is safe when the cache is private.
> 
> Regards,
> Paul Elschot
> 
> 
> 
> Op Monday 18 February 2008 13:50:16 schreef mark harwood:
> > I'm chasing down a bug in my application where multiple threads were readingand
caching the same filter (same very common term, big index) and causedan Out of Memory exception
when I would expect there to be plenty ofmemory to spare.
> > There's a number of layers to this app to investigate (I was using theXMLQueryParser
and the CachedFilter tag too) but CachingWrapperFilterunderpins all this stuff and I was led
to this code in it...
> > 
> >   public BitSet bits(IndexReader reader) throws IOException {
> >     if (cache == null) {
> >       cache = new WeakHashMap();
> >     }
> > 
> >     synchronized (cache) {  // check cache
> >       BitSet cached = (BitSet) cache.get(reader);
> >       if (cached != null) {
> >         return cached;
> >       }
> >     }
> > 
> >     final BitSet bits = filter.bits(reader);
> > 
> >     synchronized (cache) {  // update cache
> >       cache.put(reader, bits);
> >     }
> > 
> >     return bits;
> >   }
> > 
> > 
> > The first observation is - why the use of"final" for the variable "bits" ?  Would
there be anyside-effects to this? 
> > 
> > Perhaps more worryingly I can see that multiple threads asking for the same bitset
simultaneously arelikely to unnecessarily read the same data from the same reader (butultimately
only one bitset should end up cached). My app only had 2 simultaneous threads on the same
reader so I don't see how that accounts for the large memory bloat I saw. In a high traffic
environment though, I can see multiple requests for a popular term getting bottle-necked here
creating the same bitset and causing an OOM error. It looks like this multiple-load scenario
could/should be avoided with some careful synchronisation.
> > 
> > Unfortunately I've been unable to reproduce my OOM problem outside of the live environment
so can't fully pinpoint my particular issue or the solution just yet. 
> > 
> > Thoughts?
> > Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message