lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <luc...@mikemccandless.com>
Subject Re: ThreadLocal causing memory leak with J2EE applications
Date Thu, 11 Sep 2008 14:43:14 GMT

OK so we compact the list (removing dead threads) every time we add a  
new entry to the list.  This way for a long lived SegmentReader but  
short lived threads, the list keeps only live threads.

We do need sync access to the list, but that's only on binding a new  
thread.  Retrieving an existing thread has no sync.

Mike

robert engels wrote:

> You still need to sync access to the list, and how would it be  
> removed from the list prior to close? That is you need one per  
> thread, but you can have the reader shared across all threads. So if  
> threads were created and destroyed without ever closing the reader,  
> the list would grow unbounded.
>
> On Sep 11, 2008, at 9:20 AM, Michael McCandless wrote:
>
>>
>> I don't need it by thread, because I would still use ThreadLocal to  
>> retrieve the SegmentTermEnum.  This avoids any sync during get.
>>
>> The list is just a "fallback" to hold a hard reference to the  
>> SegmentTermEnum to keep it alive.  That's it's only purpose.  Then,  
>> when SegmentReader is closed this list is cleared and GC is free to  
>> reclaim all SegmentTermEnums.
>>
>> Mike
>>
>> robert engels wrote:
>>
>>> But you need it by thread, so it can't be a list.
>>>
>>> You could have a HashMap of <Thread,ThreadState> in FieldsReader,  
>>> and when SegmentReader is closed, FieldsReader is closed, which  
>>> clears the map, and not use thread locals at all. The difference  
>>> being you would need a sync'd map.
>>>
>>> On Sep 11, 2008, at 4:56 AM, Michael McCandless wrote:
>>>
>>>>
>>>> What if we wrap the value in a WeakReference, but secondarily  
>>>> hold a hard reference to it in a "normal" list?
>>>>
>>>> Then, when TermInfosReader is closed we clear that list of all  
>>>> its hard references, at which point GC will be free to reclaim  
>>>> the object out from under the ThreadLocal even before the  
>>>> ThreadLocal purges its stale entries.
>>>>
>>>> Mike
>>>>
>>>> robert engels wrote:
>>>>
>>>>> You can't hold the ThreadLocal value in a WeakReference, because  
>>>>> there is no hard reference between enumeration calls (so it  
>>>>> would be cleared out from under you while enumerating).
>>>>>
>>>>> All of this occurs because you have some objects (readers/ 
>>>>> segments etc.) that are shared across all threads, but these  
>>>>> contain objects that are 'thread/search state' specific. These  
>>>>> latter objects are essentially "cached" for performance (so you  
>>>>> don't need to seek and read, sequential buffer access, etc.)
>>>>>
>>>>> A sometimes better solution is to have the state returned to the  
>>>>> caller, and require the caller to pass/use the state later -  
>>>>> then you don't need thread locals.
>>>>>
>>>>> You can accomplish a similar solution by returning a  
>>>>> "SessionKey" object, and have the caller pass this later.  You  
>>>>> can then have a WeakHashMap of SessionKey,SearchState that the  
>>>>> code can use.  When the SessionKey is destroyed (no longer  
>>>>> referenced), the state map can be cleaned up automatically.
>>>>>
>>>>>
>>>>>
>>>>> On Sep 10, 2008, at 11:30 PM, Noble Paul നോബിള്‍  
>>>>> नोब्ळ् wrote:
>>>>>
>>>>>> When I look at the reference tree That is the feeling I get. if 

>>>>>> you
>>>>>> held a WeakReference it would get released .
>>>>>> |- base of org.apache.lucene.index.CompoundFileReader 
>>>>>> $CSIndexInput
>>>>>>            |- input of org.apache.lucene.index.SegmentTermEnum
>>>>>>                |- value of java.lang.ThreadLocal$ThreadLocalMap 
>>>>>> $Entry
>>>>>>
>>>>>> On Wed, Sep 10, 2008 at 8:39 PM, Chris Lu <chris.lu@gmail.com>
 
>>>>>> wrote:
>>>>>>> Does this make any difference?
>>>>>>> If I intentionally close the searcher and reader failed to  
>>>>>>> release the
>>>>>>> memory, I can not rely on some magic of JVM to release it.
>>>>>>> --
>>>>>>> Chris Lu
>>>>>>> -------------------------
>>>>>>> Instant Scalable Full-Text Search On Any Database/Application
>>>>>>> site: http://www.dbsight.net
>>>>>>> demo: http://search.dbsight.com
>>>>>>> Lucene Database Search in 3 minutes:
>>>>>>> http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
>>>>>>> DBSight customer, a shopping comparison site, (anonymous per
 
>>>>>>> request) got
>>>>>>> 2.6 Million Euro funding!
>>>>>>>
>>>>>>> On Wed, Sep 10, 2008 at 4:03 AM, Noble Paul  
>>>>>>> നോബിള്‍ नोब्ळ्
>>>>>>> <noble.paul@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Why do you need to keep a strong reference?
>>>>>>>> Why not a WeakReference ?
>>>>>>>>
>>>>>>>> --Noble
>>>>>>>>
>>>>>>>> On Wed, Sep 10, 2008 at 12:27 AM, Chris Lu  
>>>>>>>> <chris.lu@gmail.com> wrote:
>>>>>>>>> The problem should be similar to what's talked about
on this  
>>>>>>>>> discussion.
>>>>>>>>> http://lucene.markmail.org/message/keosgz2c2yjc7qre?q=ThreadLocal
>>>>>>>>>
>>>>>>>>> There is a memory leak for Lucene search from Lucene-1195.

>>>>>>>>> (svn r659602,
>>>>>>>>> May23,2008)
>>>>>>>>>
>>>>>>>>> This patch brings in a ThreadLocal cache to TermInfosReader.
>>>>>>>>>
>>>>>>>>> It's usually recommended to keep the reader open, and
reuse  
>>>>>>>>> it when
>>>>>>>>> possible. In a common J2EE application, the http requests
 
>>>>>>>>> are usually
>>>>>>>>> handled by different threads. But since the cache is
 
>>>>>>>>> ThreadLocal, the
>>>>>>>>> cache
>>>>>>>>> are not really usable by other threads. What's worse,
the  
>>>>>>>>> cache can not
>>>>>>>>> be
>>>>>>>>> cleared by another thread!
>>>>>>>>>
>>>>>>>>> This leak is not so obvious usually. But my case is using
 
>>>>>>>>> RAMDirectory,
>>>>>>>>> having several hundred megabytes. So one un-released
 
>>>>>>>>> resource is obvious
>>>>>>>>> to
>>>>>>>>> me.
>>>>>>>>>
>>>>>>>>> Here is the reference tree:
>>>>>>>>> org.apache.lucene.store.RAMDirectory
>>>>>>>>> |- directory of org.apache.lucene.store.RAMFile
>>>>>>>>>   |- file of org.apache.lucene.store.RAMInputStream
>>>>>>>>>       |- base of
>>>>>>>>> org.apache.lucene.index.CompoundFileReader$CSIndexInput
>>>>>>>>>           |- input of org.apache.lucene.index.SegmentTermEnum
>>>>>>>>>               |- value of java.lang.ThreadLocal 
>>>>>>>>> $ThreadLocalMap$Entry
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> After I switched back to svn revision 659601, right before
 
>>>>>>>>> this patch is
>>>>>>>>> checked in, the memory leak is gone.
>>>>>>>>> Although my case is RAMDirectory, I believe this will
affect  
>>>>>>>>> disk based
>>>>>>>>> index also.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Chris Lu
>>>>>>>>> -------------------------
>>>>>>>>> Instant Scalable Full-Text Search On Any Database/Application
>>>>>>>>> site: http://www.dbsight.net
>>>>>>>>> demo: http://search.dbsight.com
>>>>>>>>> Lucene Database Search in 3 minutes:
>>>>>>>>>
>>>>>>>>> http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
>>>>>>>>> DBSight customer, a shopping comparison site, (anonymous
per  
>>>>>>>>> request)
>>>>>>>>> got
>>>>>>>>> 2.6 Million Euro funding!
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> --Noble Paul
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>>>>>> For additional commands, e-mail: java-dev- 
>>>>>>>> help@lucene.apache.org
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> --Noble Paul
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>>>
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message