lucene-dev mailing list archives

From robert engels <reng...@ix.netcom.com>
Subject Re: ThreadLocal causing memory leak with J2EE applications
Date Thu, 11 Sep 2008 14:39:31 GMT
You still need to sync access to the list, and how would an entry be
removed from the list prior to close? That is, you need one per
thread, but the reader can be shared across all threads. So if
threads were created and destroyed without ever closing the reader,
the list would grow without bound.
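
To make the trade-off concrete, here is a minimal sketch of the scheme under discussion: a ThreadLocal holding a WeakReference, plus a synchronized list of hard references that is cleared on close. TermEnumState, PerThreadCache and LeakDemo are hypothetical names used only for illustration; this is not Lucene's actual code.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for SegmentTermEnum; not Lucene's actual class.
class TermEnumState {
    int position;
}

// Sketch of the "weak ThreadLocal value plus hard-reference list" idea.
class PerThreadCache {
    // Each thread sees only its own weak reference, so the common path takes no lock.
    private final ThreadLocal<WeakReference<TermEnumState>> local = new ThreadLocal<>();
    // Hard references keep the states alive until close(); guarded by "this".
    private final List<TermEnumState> hardRefs = new ArrayList<>();

    TermEnumState get() {
        WeakReference<TermEnumState> ref = local.get();
        TermEnumState state = (ref == null) ? null : ref.get();
        if (state == null) {
            state = new TermEnumState();
            synchronized (this) {
                hardRefs.add(state); // the only synchronized step: first use per thread
            }
            local.set(new WeakReference<>(state));
        }
        return state;
    }

    synchronized int liveStates() {
        return hardRefs.size();
    }

    // Dropping the hard references leaves only weak ones, so GC may reclaim the states.
    synchronized void close() {
        hardRefs.clear();
    }
}

class LeakDemo {
    // Calls get() on n short-lived threads, then reports how many states are still retained.
    static int retainedAfterThreads(PerThreadCache cache, int n) {
        for (int i = 0; i < n; i++) {
            Thread t = new Thread(cache::get);
            t.start();
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return cache.liveStates();
    }
}
```

In this sketch, LeakDemo.retainedAfterThreads shows the concern raised above: every thread that ever called get() leaves an entry in the list, even after the thread has died, until close() is called. A real implementation would also have to decide what get() does after close(), which this sketch ignores.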

On Sep 11, 2008, at 9:20 AM, Michael McCandless wrote:

>
> I don't need it by thread, because I would still use ThreadLocal to  
> retrieve the SegmentTermEnum.  This avoids any sync during get.
>
> The list is just a "fallback" to hold a hard reference to the  
> SegmentTermEnum to keep it alive.  That's its only purpose.  Then,  
> when SegmentReader is closed this list is cleared and GC is free to  
> reclaim all SegmentTermEnums.
>
> Mike
>
> robert engels wrote:
>
>> But you need it by thread, so it can't be a list.
>>
>> You could have a HashMap of <Thread,ThreadState> in FieldsReader,  
>> and when SegmentReader is closed, FieldsReader is closed, which  
>> clears the map, and not use thread locals at all. The difference  
>> being you would need a sync'd map.
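
The map-based alternative described above could look roughly like the following sketch. ThreadState and ThreadStateMap are hypothetical names; the idea of placing this in FieldsReader comes from the message, but the code is an assumption-laden illustration, not Lucene's implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-thread state; stands in for whatever the reader would cache.
class ThreadState {
    int reads;
}

// Sketch of a synchronized Thread -> state map that replaces ThreadLocal entirely.
class ThreadStateMap {
    private final Map<Thread, ThreadState> states = new HashMap<>();

    // Every lookup takes the lock; this is the cost compared with ThreadLocal.
    synchronized ThreadState get() {
        return states.computeIfAbsent(Thread.currentThread(), t -> new ThreadState());
    }

    synchronized int size() {
        return states.size();
    }

    // Called when the owning reader is closed; releases all per-thread state at once.
    synchronized void close() {
        states.clear();
    }
}
```

Note that a plain HashMap keyed by Thread keeps entries for threads that have already exited, so in this sketch the map is only emptied by close().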
>>
>> On Sep 11, 2008, at 4:56 AM, Michael McCandless wrote:
>>
>>>
>>> What if we wrap the value in a WeakReference, but secondarily  
>>> hold a hard reference to it in a "normal" list?
>>>
>>> Then, when TermInfosReader is closed we clear that list of all  
>>> its hard references, at which point GC will be free to reclaim  
>>> the object out from under the ThreadLocal even before the  
>>> ThreadLocal purges its stale entries.
>>>
>>> Mike
>>>
>>> robert engels wrote:
>>>
>>>> You can't hold the ThreadLocal value in a WeakReference, because  
>>>> there is no hard reference between enumeration calls (so it  
>>>> would be cleared out from under you while enumerating).
>>>>
>>>> All of this occurs because you have some objects (readers/ 
>>>> segments etc.) that are shared across all threads, but these  
>>>> contain objects that are 'thread/search state' specific. These  
>>>> latter objects are essentially "cached" for performance (so you  
>>>> don't need to seek and read, sequential buffer access, etc.)
>>>>
>>>> A sometimes better solution is to have the state returned to the  
>>>> caller, and require the caller to pass/use the state later -  
>>>> then you don't need thread locals.
>>>>
>>>> You can accomplish a similar solution by returning a  
>>>> "SessionKey" object, and have the caller pass this later.  You  
>>>> can then have a WeakHashMap of SessionKey,SearchState that the  
>>>> code can use.  When the SessionKey is destroyed (no longer  
>>>> referenced), the state map can be cleaned up automatically.
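
A minimal sketch of the SessionKey approach described above, assuming hypothetical SessionKey, SearchState and SessionStateCache classes (none of these exist in Lucene). The caller holds the key; once the key becomes unreachable, the WeakHashMap entry becomes eligible for removal.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical opaque token handed to the caller. It does not override
// equals/hashCode, so lookups are by object identity.
final class SessionKey {
}

// Hypothetical per-session search state. It must not reference its SessionKey,
// or the key could never become weakly reachable.
class SearchState {
    int position;
}

class SessionStateCache {
    // Keys are held weakly; WeakHashMap is not thread-safe, so it is wrapped.
    private final Map<SessionKey, SearchState> states =
        Collections.synchronizedMap(new WeakHashMap<>());

    // Creates state for a new session and returns the key the caller must keep.
    SessionKey open() {
        SessionKey key = new SessionKey();
        states.put(key, new SearchState());
        return key;
    }

    // The caller passes the key back on later calls to reach its own state.
    SearchState stateFor(SessionKey key) {
        return states.get(key);
    }
}
```

When cleanup actually happens depends on the garbage collector, so this design trades deterministic release for not needing thread locals.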
>>>>
>>>>
>>>>
>>>> On Sep 10, 2008, at 11:30 PM, Noble Paul  
>>>> നോബിള്‍ नोब्ळ् wrote:
>>>>
>>>>> When I look at the reference tree, that is the feeling I get: if you
>>>>> held a WeakReference, it would get released.
>>>>> |- base of org.apache.lucene.index.CompoundFileReader$CSIndexInput
>>>>>             |- input of org.apache.lucene.index.SegmentTermEnum
>>>>>                 |- value of java.lang.ThreadLocal$ThreadLocalMap$Entry
>>>>>
>>>>> On Wed, Sep 10, 2008 at 8:39 PM, Chris Lu <chris.lu@gmail.com> wrote:
>>>>>> Does this make any difference?
>>>>>> If I intentionally close the searcher and the reader fails to release
>>>>>> the memory, I cannot rely on some JVM magic to release it.
>>>>>> --
>>>>>> Chris Lu
>>>>>> -------------------------
>>>>>> Instant Scalable Full-Text Search On Any Database/Application
>>>>>> site: http://www.dbsight.net
>>>>>> demo: http://search.dbsight.com
>>>>>> Lucene Database Search in 3 minutes:
>>>>>> http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
>>>>>> DBSight customer, a shopping comparison site, (anonymous per request)
>>>>>> got 2.6 Million Euro funding!
>>>>>>
>>>>>> On Wed, Sep 10, 2008 at 4:03 AM, Noble Paul  
>>>>>> നോബിള്‍ नोब्ळ्
>>>>>> <noble.paul@gmail.com> wrote:
>>>>>>>
>>>>>>> Why do you need to keep a strong reference?
>>>>>>> Why not a WeakReference ?
>>>>>>>
>>>>>>> --Noble
>>>>>>>
>>>>>>> On Wed, Sep 10, 2008 at 12:27 AM, Chris Lu  
>>>>>>> <chris.lu@gmail.com> wrote:
>>>>>>>> The problem should be similar to what is discussed in this thread:
>>>>>>>> http://lucene.markmail.org/message/keosgz2c2yjc7qre?q=ThreadLocal
>>>>>>>>
>>>>>>>> There is a memory leak in Lucene search, introduced by Lucene-1195
>>>>>>>> (svn r659602, May 23, 2008).
>>>>>>>>
>>>>>>>> This patch brings in a ThreadLocal cache to TermInfosReader.
>>>>>>>>
>>>>>>>> It's usually recommended to keep the reader open and reuse it when
>>>>>>>> possible. In a common J2EE application, the HTTP requests are usually
>>>>>>>> handled by different threads. But since the cache is ThreadLocal, it
>>>>>>>> is not really usable by other threads. What's worse, the cache cannot
>>>>>>>> be cleared by another thread!
>>>>>>>>
>>>>>>>> This leak is usually not so obvious. But my case uses a RAMDirectory
>>>>>>>> of several hundred megabytes, so one unreleased resource is obvious
>>>>>>>> to me.
>>>>>>>>
>>>>>>>> Here is the reference tree:
>>>>>>>> org.apache.lucene.store.RAMDirectory
>>>>>>>> |- directory of org.apache.lucene.store.RAMFile
>>>>>>>>    |- file of org.apache.lucene.store.RAMInputStream
>>>>>>>>        |- base of org.apache.lucene.index.CompoundFileReader$CSIndexInput
>>>>>>>>            |- input of org.apache.lucene.index.SegmentTermEnum
>>>>>>>>                |- value of java.lang.ThreadLocal$ThreadLocalMap$Entry
>>>>>>>>
>>>>>>>>
>>>>>>>> After I switched back to svn revision 659601, right before this patch
>>>>>>>> was checked in, the memory leak was gone.
>>>>>>>> Although my case uses RAMDirectory, I believe this will also affect
>>>>>>>> disk-based indexes.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Chris Lu
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> --Noble Paul
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -- 
>>>>> --Noble Paul
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>



