abdera-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Berry <chriswbe...@gmail.com>
Subject Re: memory leak
Date Tue, 06 Nov 2007 00:48:18 GMT
James,
I wonder if it might be prudent to backport this change into 0.3.0??
As Bryon said, it is a very substantial leak -- leaking >1GB in a  
matter of minutes in our load tests.
Although, it is relatively easy to workaround.
Cheers,
-- Chris 


On Nov 5, 2007, at 5:39 PM, James M Snell wrote:

> Ok, I went ahead and committed the change that removes the wrapper
> cache.  Things seem to be working fine.  Please kick the tires and see
> if things are working out.
>
> - James
>
> Bryon Jacob wrote:
>> Hey All -
>>
>> I'm working on a data service based on Abdera (working with Chris  
>> Berry,
>> who's a regular on these lists...)  When we were running our first
>> battery of serious load testing on our system, we encountered
>> memory-leaky behavior, and a profiler showed us that we were indeed
>> leaking hundreds of megabytes a minute, all traceable back to the
>> wrappers field on org.apache.abdera.factory.ExtensionFactoryMap.   
>> This
>> field is a map from elements to their wrappers, if any.  At first,  
>> I was
>> puzzled by the memory leak, as the field is initialized thusly:
>>
>>     this.wrappers = Collections.synchronizedMap( new
>> WeakHashMap<Element,Element>());
>>
>> clearly, the implementor took care to make sure that this cache would
>> not leak by making it a WeakHashMap, which generally guarantees  
>> that the
>> map itself will not keep a key and its corresponding entry from being
>> garbage collected.  I dug throughout our application code to find  
>> if we
>> were actually holding other references to these objects, and I  
>> googled
>> for anyone having problems with esoteric interactions between
>> Collections.synchronizedMap and WeakHashMaps - found nothing there.
>> Then I went back to square one and re-read the WeakHashMap javadoc  
>> very
>> carefully.  Here's the relevant section:
>>
>>     Implementation note: The value objects in a WeakHashMap are  
>> held by
>> ordinary strong references. Thus care should be taken to ensure that
>> value objects do not strongly refer to their own keys, either  
>> directly
>> or indirectly, since that will prevent the keys from being discarded.
>> Note that a value object may refer indirectly to its key via the
>> WeakHashMap itself; that is, a value object may strongly refer to  
>> some
>> other key object whose associated value object, in turn, strongly  
>> refers
>> to the key of the first value object. One way to deal with this is to
>> wrap values themselves within WeakReferences before inserting, as in:
>> m.put(key, new WeakReference(value)), and then unwrapping upon  
>> each get.
>>
>> This is why there is a memory leak - the map is a mapping from  
>> elements
>> to their wrappers - by the very nature of the object being a  
>> wrapper of
>> the element, it will usually have a strong reference to the element
>> itself, which is the key! You can verify that Abdera wrappers, in
>> general, will do this by looking at
>> org.apache.abdera.model.ElementWrapper, which takes the element being
>> wrapped as its constructor argument, and holds a strong reference  
>> to it
>> as an instance variable.
>>
>> This map is an optimization to memoize the calls to  
>> getElementWrapper()
>> and not reconstruct them more than is necessary - it is not needed  
>> for
>> abdera to function properly, so we have temporarily worked around the
>> problem in our own application like so - we used to acquire our
>> FOMFactory by calling abdera.getFactory() on our
>> org.apache.abdera.Abdera instance, and re-using that singleton
>> throughout our application.  Now we construct a new FOMFactory  
>> with new
>> FOMFactory(abdera) once per request to the server, and since the only
>> appreciable state on the factory is this map itself, this is a valid
>> work-around.
>>
>> I'd initially planned to really fix this issue and submit a patch  
>> along
>> with this message, but digging a little deeper, I'm not sure that the
>> correct fix is crystal clear...  We could do as the javadoc above
>> suggests, and wrap the values with WeakReferences to plug the  
>> leak, or
>> we could use a LinkedHashMap configured as an LRU cache to just bound
>> the cache, so it can't grow out of control - but right now, I don't
>> think that either of those solutions would be correct, because it  
>> seems
>> to me that none of the objects in the hierarchy rooted at FOMElement
>> define equals() and/or hashCode() methods, so all of the objects are
>> cached based on their actual object identity.  It seems that in  
>> the all
>> likely use cases, instances of FOMElement and its descendants are
>> re-parsed on every request to a server running abdera, and so what we
>> will see is cache misses virtually 100% of the time, so even  
>> though we'd
>> have plugged the leak, strictly speaking, we would have ignored the
>> underlying issue that we're caching data on every request that  
>> will be
>> fundamentally unable to be retrieved on subsequent requests.  This is
>> based only on my reading over the code for a few hours, so I could be
>> missing something, and I also might be forgetting about a use case  
>> that
>> demands and makes proper use of this memoization, but as it stands  
>> right
>> now, my recommended fix would probably be to just cut out the cache
>> altogether, and allow for wrappers to get constructed fresh every  
>> time
>> they are requested.  One more possibility is that the cache is  
>> actually
>> a useful optimization, but only during the scope of one request - in
>> which case the "work-around" we are using now is actually the best
>> practice, and the fix would be to remove the factory instance on the
>> Abdera class...
>>
>> I'd like to hear from the Abdera developers what their thoughts  
>> are on
>> this issue, and what the best resolution is likely to be.  This is no
>> longer a pressing issue for our team, but it is potentially a time  
>> bomb
>> waiting to blow up for any project dependent on Abdera.
>>
>> thanks!  (and thanks for Abdera, generally - we're easily a year  
>> ahead
>> of where we'd be on this project without it!)
>>
>> -Bryon (long-time listener, first-time caller)
>>
>>

S'all good  ---   chriswberry at gmail dot com




Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message