lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From mark harwood <markharw...@yahoo.co.uk>
Subject Re: Out of memory - CachingWrappperFilter and multiple threads
Date Mon, 18 Feb 2008 16:08:50 GMT
>>Even though you only have two threads... how many different  IndexReader instances
do you have?

One reader - kept open throughout.

>>I think the only way to make the current version
>>work correctly is to make whole method synchronized on this
>>object, i.e. declare it as public synchronized, and then
>>the synchronized(cache) occurrences can be removed.

Analysing the pros and cons of such a  change...
1) Synchronisation costs for  pre-cached content on the same reader (hopefully the most common
case) would be just as quick as it is now i.e. swapping a synchronised(this) for synchronised(cache).
2) Synchronisation for uncached content would avoid reading multiple copies of the same bitset
(one would expect overall faster responses in this scenario) 
3) However, there would be an additional synchronisation cost for threads using unrelated
readers - a thread using an old reader with pre-cached content would be delayed (potentially
significantly) by a thread caching new content on a new reader.

This last one sounds a bit nasty?

Cheers
Mark

----- Original Message ----
From: Paul Elschot <paul.elschot@xs4all.nl>
To: java-dev@lucene.apache.org
Sent: Monday, 18 February, 2008 3:23:01 PM
Subject: Re: Out of memory - CachingWrappperFilter and multiple threads

I think this code has been discussed before, but I can't find
the occasion on which that happened.

Actually, this code may do some things like creating two different
caches, and as you said calling filter.bits(reader) in parallel.

Creating different caches does not hurt because in the end
one only one will survive the assignment,  but it's not nice as
different threads might synchronize on different caches
for the first cache.get() call, and I don't know whether
java guarantees that the cache.get() call will synchronize
on the same cache as the cache in the cache.get()
call in this case.

Doing the filter.bits(reader) call without synchronization
is probably on purpose: it's better not to hold any visible
lock while calling outside an object.

As it stands, I think the only way to make the current version
work correctly is to make whole method synchronized on this
object, i.e. declare it as public synchronized, and then
the synchronized(cache) occurrences can be removed.

It might be better to initialize the cache in the constructor,
and then synchronize on the cache while even while
calling filter.bits(reader). This is safe when the cache is private.

Regards,
Paul Elschot



Op Monday 18 February 2008 13:50:16 schreef mark harwood:
> I'm chasing down a bug in my application where multiple threads were readingand caching
the same filter (same very common term, big index) and causedan Out of Memory exception when
I would expect there to be plenty ofmemory to spare.
> There's a number of layers to this app to investigate (I was using theXMLQueryParser
and the CachedFilter tag too) but CachingWrapperFilterunderpins all this stuff and I was led
to this code in it...
> 
>   public BitSet bits(IndexReader reader) throws IOException {
>     if (cache == null) {
>       cache = new WeakHashMap();
>     }
> 
>     synchronized (cache) {  // check cache
>       BitSet cached = (BitSet) cache.get(reader);
>       if (cached != null) {
>         return cached;
>       }
>     }
> 
>     final BitSet bits = filter.bits(reader);
> 
>     synchronized (cache) {  // update cache
>       cache.put(reader, bits);
>     }
> 
>     return bits;
>   }
> 
> 
> The first observation is - why the use of"final" for the variable "bits" ?  Would there
be anyside-effects to this? 
> 
> Perhaps more worryingly I can see that multiple threads asking for the same bitset simultaneously
arelikely to unnecessarily read the same data from the same reader (butultimately only one
bitset should end up cached). My app only had 2 simultaneous threads on the same reader so
I don't see how that accounts for the large memory bloat I saw. In a high traffic environment
though, I can see multiple requests for a popular term getting bottle-necked here creating
the same bitset and causing an OOM error. It looks like this multiple-load scenario could/should
be avoided with some careful synchronisation.
> 
> Unfortunately I've been unable to reproduce my OOM problem outside of the live environment
so can't fully pinpoint my particular issue or the solution just yet. 
> 
> Thoughts?
> Mark
> 
> 
> 
> 
>       __________________________________________________________
> Sent from Yahoo! Mail - a smarter inbox http://uk.mail.yahoo.com
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
> 
> 
> 
> 







      __________________________________________________________
Sent from Yahoo! Mail - a smarter inbox http://uk.mail.yahoo.com


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message