lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From robert engels <reng...@ix.netcom.com>
Subject Re: ThreadLocal causing memory leak with J2EE applications
Date Thu, 11 Sep 2008 15:01:18 GMT
I think that would work, but I think you would be better off  
encapsulating that in an extended ThreadLocal, e.g. WeakThreadLocal,  
and use that every where. Add a method clear(), that clears the  
ThreadLocals list (which will allow the values to be GC'd).


On Sep 11, 2008, at 9:43 AM, Michael McCandless wrote:

>
> OK so we compact the list (removing dead threads) every time we add  
> a new entry to the list.  This way for a long lived SegmentReader  
> but short lived threads, the list keeps only live threads.
>
> We do need sync access to the list, but that's only on binding a  
> new thread.  Retrieving an existing thread has no sync.
>
> Mike
>
> robert engels wrote:
>
>> You still need to sync access to the list, and how would it be  
>> removed from the list prior to close? That is you need one per  
>> thread, but you can have the reader shared across all threads. So  
>> if threads were created and destroyed without ever closing the  
>> reader, the list would grow unbounded.
>>
>> On Sep 11, 2008, at 9:20 AM, Michael McCandless wrote:
>>
>>>
>>> I don't need it by thread, because I would still use ThreadLocal  
>>> to retrieve the SegmentTermEnum.  This avoids any sync during get.
>>>
>>> The list is just a "fallback" to hold a hard reference to the  
>>> SegmentTermEnum to keep it alive.  That's it's only purpose.   
>>> Then, when SegmentReader is closed this list is cleared and GC is  
>>> free to reclaim all SegmentTermEnums.
>>>
>>> Mike
>>>
>>> robert engels wrote:
>>>
>>>> But you need it by thread, so it can't be a list.
>>>>
>>>> You could have a HashMap of <Thread,ThreadState> in  
>>>> FieldsReader, and when SegmentReader is closed, FieldsReader is  
>>>> closed, which clears the map, and not use thread locals at all.  
>>>> The difference being you would need a sync'd map.
>>>>
>>>> On Sep 11, 2008, at 4:56 AM, Michael McCandless wrote:
>>>>
>>>>>
>>>>> What if we wrap the value in a WeakReference, but secondarily  
>>>>> hold a hard reference to it in a "normal" list?
>>>>>
>>>>> Then, when TermInfosReader is closed we clear that list of all  
>>>>> its hard references, at which point GC will be free to reclaim  
>>>>> the object out from under the ThreadLocal even before the  
>>>>> ThreadLocal purges its stale entries.
>>>>>
>>>>> Mike
>>>>>
>>>>> robert engels wrote:
>>>>>
>>>>>> You can't hold the ThreadLocal value in a WeakReference,  
>>>>>> because there is no hard reference between enumeration calls  
>>>>>> (so it would be cleared out from under you while enumerating).
>>>>>>
>>>>>> All of this occurs because you have some objects (readers/ 
>>>>>> segments etc.) that are shared across all threads, but these  
>>>>>> contain objects that are 'thread/search state' specific. These  
>>>>>> latter objects are essentially "cached" for performance (so  
>>>>>> you don't need to seek and read, sequential buffer access, etc.)
>>>>>>
>>>>>> A sometimes better solution is to have the state returned to  
>>>>>> the caller, and require the caller to pass/use the state later  
>>>>>> - then you don't need thread locals.
>>>>>>
>>>>>> You can accomplish a similar solution by returning a  
>>>>>> "SessionKey" object, and have the caller pass this later.  You  
>>>>>> can then have a WeakHashMap of SessionKey,SearchState that the  
>>>>>> code can use.  When the SessionKey is destroyed (no longer  
>>>>>> referenced), the state map can be cleaned up automatically.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sep 10, 2008, at 11:30 PM, Noble Paul  
>>>>>> നോബിള്‍ नोब्ळ् wrote:
>>>>>>
>>>>>>> When I look at the reference tree That is the feeling I get.
 
>>>>>>> if you
>>>>>>> held a WeakReference it would get released .
>>>>>>> |- base of org.apache.lucene.index.CompoundFileReader 
>>>>>>> $CSIndexInput
>>>>>>>            |- input of org.apache.lucene.index.SegmentTermEnum
>>>>>>>                |- value of java.lang.ThreadLocal 
>>>>>>> $ThreadLocalMap$Entry
>>>>>>>
>>>>>>> On Wed, Sep 10, 2008 at 8:39 PM, Chris Lu  
>>>>>>> <chris.lu@gmail.com> wrote:
>>>>>>>> Does this make any difference?
>>>>>>>> If I intentionally close the searcher and reader failed to
 
>>>>>>>> release the
>>>>>>>> memory, I can not rely on some magic of JVM to release it.
>>>>>>>> --
>>>>>>>> Chris Lu
>>>>>>>> -------------------------
>>>>>>>> Instant Scalable Full-Text Search On Any Database/Application
>>>>>>>> site: http://www.dbsight.net
>>>>>>>> demo: http://search.dbsight.com
>>>>>>>> Lucene Database Search in 3 minutes:
>>>>>>>> http://wiki.dbsight.com/index.php? 
>>>>>>>> title=Create_Lucene_Database_Search_in_3_minutes
>>>>>>>> DBSight customer, a shopping comparison site, (anonymous
per  
>>>>>>>> request) got
>>>>>>>> 2.6 Million Euro funding!
>>>>>>>>
>>>>>>>> On Wed, Sep 10, 2008 at 4:03 AM, Noble Paul  
>>>>>>>> നോബിള്‍ नोब्ळ्
>>>>>>>> <noble.paul@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Why do you need to keep a strong reference?
>>>>>>>>> Why not a WeakReference ?
>>>>>>>>>
>>>>>>>>> --Noble
>>>>>>>>>
>>>>>>>>> On Wed, Sep 10, 2008 at 12:27 AM, Chris Lu  
>>>>>>>>> <chris.lu@gmail.com> wrote:
>>>>>>>>>> The problem should be similar to what's talked about
on  
>>>>>>>>>> this discussion.
>>>>>>>>>> http://lucene.markmail.org/message/keosgz2c2yjc7qre?

>>>>>>>>>> q=ThreadLocal
>>>>>>>>>>
>>>>>>>>>> There is a memory leak for Lucene search from Lucene-1195.

>>>>>>>>>> (svn r659602,
>>>>>>>>>> May23,2008)
>>>>>>>>>>
>>>>>>>>>> This patch brings in a ThreadLocal cache to TermInfosReader.
>>>>>>>>>>
>>>>>>>>>> It's usually recommended to keep the reader open,
and  
>>>>>>>>>> reuse it when
>>>>>>>>>> possible. In a common J2EE application, the http
requests  
>>>>>>>>>> are usually
>>>>>>>>>> handled by different threads. But since the cache
is  
>>>>>>>>>> ThreadLocal, the
>>>>>>>>>> cache
>>>>>>>>>> are not really usable by other threads. What's worse,
the  
>>>>>>>>>> cache can not
>>>>>>>>>> be
>>>>>>>>>> cleared by another thread!
>>>>>>>>>>
>>>>>>>>>> This leak is not so obvious usually. But my case
is using  
>>>>>>>>>> RAMDirectory,
>>>>>>>>>> having several hundred megabytes. So one un-released
 
>>>>>>>>>> resource is obvious
>>>>>>>>>> to
>>>>>>>>>> me.
>>>>>>>>>>
>>>>>>>>>> Here is the reference tree:
>>>>>>>>>> org.apache.lucene.store.RAMDirectory
>>>>>>>>>> |- directory of org.apache.lucene.store.RAMFile
>>>>>>>>>>   |- file of org.apache.lucene.store.RAMInputStream
>>>>>>>>>>       |- base of
>>>>>>>>>> org.apache.lucene.index.CompoundFileReader$CSIndexInput
>>>>>>>>>>           |- input of org.apache.lucene.index.SegmentTermEnum
>>>>>>>>>>               |- value of java.lang.ThreadLocal 
>>>>>>>>>> $ThreadLocalMap$Entry
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> After I switched back to svn revision 659601, right
before  
>>>>>>>>>> this patch is
>>>>>>>>>> checked in, the memory leak is gone.
>>>>>>>>>> Although my case is RAMDirectory, I believe this
will  
>>>>>>>>>> affect disk based
>>>>>>>>>> index also.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Chris Lu
>>>>>>>>>> -------------------------
>>>>>>>>>> Instant Scalable Full-Text Search On Any Database/Application
>>>>>>>>>> site: http://www.dbsight.net
>>>>>>>>>> demo: http://search.dbsight.com
>>>>>>>>>> Lucene Database Search in 3 minutes:
>>>>>>>>>>
>>>>>>>>>> http://wiki.dbsight.com/index.php? 
>>>>>>>>>> title=Create_Lucene_Database_Search_in_3_minutes
>>>>>>>>>> DBSight customer, a shopping comparison site, (anonymous
 
>>>>>>>>>> per request)
>>>>>>>>>> got
>>>>>>>>>> 2.6 Million Euro funding!
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> --Noble Paul
>>>>>>>>>
>>>>>>>>> --------------------------------------------------------------

>>>>>>>>> -------
>>>>>>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>>>>>>> For additional commands, e-mail: java-dev- 
>>>>>>>>> help@lucene.apache.org
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -- 
>>>>>>> --Noble Paul
>>>>>>
>>>>>>
>>>>>> -----------------------------------------------------------------

>>>>>> ----
>>>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>>>>
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------ 
>>>>> ---
>>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>>>
>>>>
>>>>
>>>> ------------------------------------------------------------------- 
>>>> --
>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>>
>>>
>>>
>>> -------------------------------------------------------------------- 
>>> -
>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message