lucene-dev mailing list archives

From "Chris Lu" <chris...@gmail.com>
Subject Re: ThreadLocal causing memory leak with J2EE applications
Date Wed, 10 Sep 2008 04:44:11 GMT
In a J2EE environment, there is usually a searcher pool with several searchers
open. Opening a large index for every user is too slow to be acceptable.

-- 
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer, a shopping comparison site, (anonymous per request) got
2.6 Million Euro funding!

On Tue, Sep 9, 2008 at 9:03 PM, robert engels <rengels@ix.netcom.com> wrote:

> You need to close the searcher within the thread that is using it, in order
> to have it cleaned up quickly... usually right after you display the page of
> results.
> If you are keeping multiple searcher refs across multiple threads for
> paging/whatever, you have not coded it correctly.
>
> Imagine 10,000 users - storing a searcher for each one is not going to
> work...
>
> On Sep 9, 2008, at 10:21 PM, Chris Lu wrote:
>
> Right, in a sense I can not release it from another thread. But that's the
> problem.
>
> It's a J2EE environment, where all threads are more or less equal. It's simply
> not possible to iterate through all threads to close the searcher and thereby
> release the ThreadLocal cache.
> Unless Lucene is not recommended for J2EE environments, this has to be
> fixed.
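
The point about not being able to release a ThreadLocal entry from another thread can be illustrated with a small, self-contained sketch. It uses only the JDK; the class `ThreadLocalOwnershipDemo`, the `CACHE` field, and the helper `onWorker` are hypothetical names made up for illustration and are not part of Lucene. A single-thread executor stands in for an application server's reused worker thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class; none of these names come from Lucene.
class ThreadLocalOwnershipDemo {
    // Stand-in for a per-thread cache slot.
    static final ThreadLocal<byte[]> CACHE = new ThreadLocal<>();

    // Runs a task on the pool's worker thread and waits for its result.
    static <T> T onWorker(ExecutorService pool, Callable<T> task) {
        try {
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // True if the worker still sees its value after a different thread calls remove().
    static boolean workerKeepsValueAfterForeignRemove() {
        ExecutorService pool = Executors.newSingleThreadExecutor(); // one reused worker
        try {
            onWorker(pool, () -> { CACHE.set(new byte[1024]); return null; });
            CACHE.remove(); // runs on the caller's thread: clears only the caller's slot
            return onWorker(pool, () -> CACHE.get() != null);
        } finally {
            pool.shutdown();
        }
    }

    // True if the worker no longer sees a value once it calls remove() itself.
    static boolean workerCanClearItsOwnValue() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            onWorker(pool, () -> { CACHE.set(new byte[1024]); return null; });
            onWorker(pool, () -> { CACHE.remove(); return null; });
            return onWorker(pool, () -> CACHE.get() == null);
        } finally {
            pool.shutdown();
        }
    }
}
```

Both methods should return true: calling `remove()` from a different thread leaves the worker's value in place, and only the worker itself (or the worker thread ending) releases it.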
>
> On Tue, Sep 9, 2008 at 8:14 PM, robert engels <rengels@ix.netcom.com> wrote:
>
>> Your code is not correct. You cannot release it on another thread - the
>> first thread may be creating hundreds/thousands of instances before the other
>> thread ever runs...
>>
>> On Sep 9, 2008, at 10:10 PM, Chris Lu wrote:
>>
>> If I release it on the thread that's creating the searcher, by setting
>> searcher=null, everything is fine, the memory is released very cleanly.
>> My load test was to repeatedly create a searcher on a RAMDirectory and
>> release it on another thread. The test quickly goes OOM after several
>> runs. I set the heap size to 1024M, and the RAMDirectory is 250M in size.
>> In a profiling tool, the used heap visibly stepped up in increments of
>> 250M.
>>
>> I think we should not rely on something that's a "maybe" behavior,
>> especially for a general purpose library.
>>
>> Since it's a multi-threaded env, the thread that's creating the entries in
>> the LRU cache may not go away quickly (actually most, if not all, application
>> servers will try to reuse threads). So the LRU cache, which uses the thread as
>> the key, cannot be released, and therefore the SegmentTermEnum, which is in
>> the same class, cannot be released either.
>>
>> And yes, I close the RAMDirectory, and the fileMap is released. I verified
>> that through the profiler by directly checking the values in the snapshot.
>>
>> I'm pretty sure the reference tree wasn't like this with the code before this
>> commit, because after closing the searcher in another thread, the RAMDirectory
>> totally disappeared from the memory snapshot.
>>
>> On Tue, Sep 9, 2008 at 5:03 PM, Michael McCandless <lucene@mikemccandless.com> wrote:
>>
>>>
>>> Chris Lu wrote:
>>>
>>>> The problem should be similar to what was discussed in this thread:
>>>> http://lucene.markmail.org/message/keosgz2c2yjc7qre?q=ThreadLocal
>>>>
>>>
>>> The "rough" conclusion of that thread is that, technically, this isn't a
>>> memory leak but rather a "delayed freeing" problem.  Ie, it may take longer,
>>> possibly much longer, than you want for the memory to be freed.
>>>
>>>> There is a memory leak in Lucene search from LUCENE-1195 (svn r659602,
>>>> May 23, 2008).
>>>>
>>>> This patch brings in a ThreadLocal cache to TermInfosReader.
>>>>
>>>
>>> One thing that confuses me: TermInfosReader was already using a
>>> ThreadLocal to cache the SegmentTermEnum instance.  What was added in this
>>> commit (for LUCENE-1195) was an LRU cache storing Term -> TermInfo
>>> instances.  But it seems like it's the SegmentTermEnum instance that you're
>>> tracing below.
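
For readers unfamiliar with the kind of cache Mike describes (an LRU cache mapping Term -> TermInfo, held per thread), here is a minimal sketch of the general technique using the JDK's `LinkedHashMap`. This is an assumption about the general shape only, not the code from the LUCENE-1195 commit; `LruCache` and `PerThreadTermCache` are hypothetical names, and plain strings stand in for Term and TermInfo.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the code from LUCENE-1195.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = order entries by access, least recent first
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least recently used entry once the cache is over capacity.
        return size() > maxEntries;
    }
}

class PerThreadTermCache {
    // Each thread lazily gets its own small cache.
    // Strings stand in for Term (key) and TermInfo (value).
    static final ThreadLocal<LruCache<String, String>> CACHE =
            ThreadLocal.withInitial(() -> new LruCache<String, String>(2));
}
```

Because each thread owns its own `LruCache` instance, whatever the cached values reference stays reachable for as long as that thread lives, unless the thread itself clears the entry. That is the retention pattern being debated in this thread.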
>>>
>>>> It's usually recommended to keep the reader open and reuse it when
>>>> possible. In a common J2EE application, the HTTP requests are usually
>>>> handled by different threads. But since the cache is ThreadLocal, the
>>>> cache is not really usable by other threads. What's worse, the cache
>>>> cannot be cleared by another thread!
>>>>
>>>> This leak is usually not so obvious. But my case uses a RAMDirectory
>>>> of several hundred megabytes, so a single unreleased resource is obvious
>>>> to me.
>>>>
>>>> Here is the reference tree:
>>>> org.apache.lucene.store.RAMDirectory
>>>>  |- directory of org.apache.lucene.store.RAMFile
>>>>     |- file of org.apache.lucene.store.RAMInputStream
>>>>         |- base of org.apache.lucene.index.CompoundFileReader$CSIndexInput
>>>>             |- input of org.apache.lucene.index.SegmentTermEnum
>>>>                 |- value of java.lang.ThreadLocal$ThreadLocalMap$Entry
>>>>
>>>
>>> So you have a RAMDir with several hundred MB stored in it that you're
>>> done with, yet through this path Lucene is keeping it alive?
>>>
>>> Did you close the RAMDir?  (which will null its fileMap and should also
>>> free your memory).
>>>
>>> Also, that reference tree doesn't show the ThreadResources class that was
>>> added in that commit -- are you sure this reference tree wasn't before the
>>> commit?
>>>
>>> Mike
>>>
>>
>>
