lucene-dev mailing list archives

From robert engels <reng...@ix.netcom.com>
Subject Re: ThreadLocal causing memory leak with J2EE applications
Date Wed, 10 Sep 2008 04:03:27 GMT
You need to close the searcher within the thread that is using it, in  
order to have it cleaned up quickly... usually right after you  
display the page of results.

If you are keeping multiple searcher refs across multiple threads for  
paging/whatever, you have not coded it correctly.

Imagine 10,000 users - storing a searcher for each one is not going  
to work...
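
As a rough illustration of closing on the thread that did the work, here is a minimal sketch. It deliberately does not use Lucene: `DemoSearcher`, `handleRequest` and `openSearchers` are hypothetical names invented for this example, and the stand-in only records which thread opened it so that a close from any other thread fails loudly.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class PerRequestSearcherSketch {

    /** Hypothetical stand-in for a searcher; this is NOT a Lucene class. */
    static final class DemoSearcher implements AutoCloseable {
        static final AtomicInteger OPEN = new AtomicInteger();

        // Remember which thread opened this instance.
        private final Thread owner = Thread.currentThread();

        DemoSearcher() {
            OPEN.incrementAndGet();
        }

        String search(String query) {
            return "results for " + query;
        }

        @Override
        public void close() {
            // Refuse to be closed from a thread other than the opener.
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("closed on a different thread");
            }
            OPEN.decrementAndGet();
        }
    }

    /** Opens, uses and closes the searcher on the calling (request) thread. */
    public static String handleRequest(String query) {
        try (DemoSearcher searcher = new DemoSearcher()) {
            return searcher.search(query);
        }
    }

    /** Number of stand-in searchers that have been opened but not closed. */
    public static int openSearchers() {
        return DemoSearcher.OPEN.get();
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("lucene"));
        System.out.println("open searchers: " + openSearchers());
    }
}
```

The point of the sketch is only the shape: whatever a request thread opens is released in a finally-style block on that same thread, before the thread goes back to the pool.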

On Sep 9, 2008, at 10:21 PM, Chris Lu wrote:

> Right, in a sense I can not release it from another thread. But  
> that's the problem.
>
> It's a J2EE environment, all threads are kind of equal. It's simply  
> not possible to iterate through all threads to close the searcher,  
> thus releasing the ThreadLocal cache.
> Unless Lucene is not recommended for J2EE environments, this has to
> be fixed.
>
> -- 
> Chris Lu
> -------------------------
> Instant Scalable Full-Text Search On Any Database/Application
> site: http://www.dbsight.net
> demo: http://search.dbsight.com
> Lucene Database Search in 3 minutes: http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
> DBSight customer, a shopping comparison site, (anonymous per  
> request) got 2.6 Million Euro funding!
>
>
> On Tue, Sep 9, 2008 at 8:14 PM, robert engels  
> <rengels@ix.netcom.com> wrote:
> Your code is not correct. You cannot release it on another thread -
> the first thread may be creating hundreds/thousands of instances
> before the other thread ever runs...
>
> On Sep 9, 2008, at 10:10 PM, Chris Lu wrote:
>
>> If I release it on the thread that's creating the searcher, by  
>> setting searcher=null, everything is fine, the memory is released  
>> very cleanly.
>> My load test was to repeatedly create a searcher on a RAMDirectory
>> and release it on another thread. The test quickly hits OOM after
>> several runs. I set the heap size to 1024M, and the RAMDirectory
>> is 250M. In a profiling tool, the used size visibly stepped up by
>> 250M.
>>
>> I think we should not rely on something that's a "maybe" behavior,  
>> especially for a general purpose library.
>>
>> Since it's a multi-threaded environment, the thread that creates
>> the entries in the LRU cache may not go away quickly (actually
>> most, if not all, application servers will try to reuse threads).
>> So the LRU cache, which uses the thread as the key, cannot be
>> released, and neither can the SegmentTermEnum, which is in the
>> same class.
>>
>> And yes, I close the RAMDirectory, and the fileMap is released. I  
>> verified that through the profiler by directly checking the values  
>> in the snapshot.
>>
>> I'm pretty sure the reference tree wasn't like this with the code
>> before this commit, because after closing the searcher in another
>> thread, the RAMDirectory disappeared completely from the memory
>> snapshot.
>>
>> On Tue, Sep 9, 2008 at 5:03 PM, Michael McCandless  
>> <lucene@mikemccandless.com> wrote:
>>
>> Chris Lu wrote:
>>
>> The problem should be similar to what's discussed in this thread:
>> http://lucene.markmail.org/message/keosgz2c2yjc7qre?q=ThreadLocal
>>
>> The "rough" conclusion of that thread is that, technically, this  
>> isn't a memory leak but rather a "delayed freeing" problem.  Ie,  
>> it may take longer, possibly much longer, than you want for the  
>> memory to be freed.
>>
>>
>> There is a memory leak in Lucene search from LUCENE-1195 (svn
>> r659602, May 23, 2008).
>>
>> This patch brings in a ThreadLocal cache to TermInfosReader.
>>
>> One thing that confuses me: TermInfosReader was already using a  
>> ThreadLocal to cache the SegmentTermEnum instance.  What was added  
>> in this commit (for LUCENE-1195) was an LRU cache storing Term ->  
>> TermInfo instances.  But it seems like it's the SegmentTermEnum  
>> instance that you're tracing below.
>>
>>
>> It's usually recommended to keep the reader open, and reuse it when
>> possible. In a common J2EE application, the http requests are usually
>> handled by different threads. But since the cache is ThreadLocal,
>> the cache is not really usable by other threads. What's worse, the
>> cache cannot be cleared by another thread!
>>
>> This leak is usually not so obvious. But my case uses a RAMDirectory
>> holding several hundred megabytes, so one unreleased resource is
>> obvious to me.
>>
>> Here is the reference tree:
>> org.apache.lucene.store.RAMDirectory
>>  |- directory of org.apache.lucene.store.RAMFile
>>     |- file of org.apache.lucene.store.RAMInputStream
>>         |- base of org.apache.lucene.index.CompoundFileReader$CSIndexInput
>>             |- input of org.apache.lucene.index.SegmentTermEnum
>>                 |- value of java.lang.ThreadLocal$ThreadLocalMap$Entry
>>
>> So you have a RAMDir with several hundred MB stored in it that
>> you're done with, yet through this path Lucene is keeping it
>> alive?
>>
>> Did you close the RAMDir?  (which will null its fileMap and should  
>> also free your memory).
>>
>> Also, that reference tree doesn't show the ThreadResources class  
>> that was added in that commit -- are you sure this reference tree  
>> wasn't before the commit?
>>
>> Mike
>
>
>
>

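The disagreement in this thread comes down to how a ThreadLocal behaves on a pooled thread. A minimal, Lucene-free sketch of that behavior follows. The class name, the `CACHE` field and the 1 KB payload are hypothetical stand-ins and not the TermInfosReader code; the single-thread executor plays the role of an application server's reused worker thread.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadLocalRetentionDemo {

    // Hypothetical per-thread cache, standing in for the one discussed above.
    static final ThreadLocal<byte[]> CACHE = new ThreadLocal<>();

    private static <T> T await(Future<T> future) {
        try {
            return future.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True if the worker's entry is still there after the caller calls remove(). */
    public static boolean survivesRemoveFromOtherThread() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            await(pool.submit(() -> CACHE.set(new byte[1024])));
            // remove() only touches the calling thread's own slot.
            CACHE.remove();
            Future<Boolean> stillSet = pool.submit(() -> CACHE.get() != null);
            return await(stillSet);
        } finally {
            pool.shutdown();
        }
    }

    /** True if the entry is gone once the worker thread itself calls remove(). */
    public static boolean clearedByOwningThread() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            await(pool.submit(() -> CACHE.set(new byte[1024])));
            await(pool.submit(() -> CACHE.remove()));
            Future<Boolean> cleared = pool.submit(() -> CACHE.get() == null);
            return await(cleared);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("survives foreign remove: " + survivesRemoveFromOtherThread());
        System.out.println("cleared by owning thread: " + clearedByOwningThread());
    }
}
```

Both methods return true: a value stored by the worker thread is unaffected by `remove()` on any other thread, and it is dropped only when the worker thread clears it (or, in a real server, when that thread eventually exits). This says nothing about when the garbage collector reclaims the memory, which is the "delayed freeing" question raised earlier in the thread.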
