From: Chris Lu
To: java-dev@lucene.apache.org
Date: Tue, 9 Sep 2008 20:10:43 -0700
Subject: Re: ThreadLocal causing memory leak with J2EE applications

If I release it on the thread that's creating the searcher, by setting searcher=null, everything is fine, the memory is released very cleanly.
My load test was to repeatedly create a searcher on a RAMDirectory and release it on another thread. The test will quickly go to OOM after several runs. I set the heap size to be 1024M, and the RAMDirectory is of size 250M. Using some profiling tool, the used size simply stepped up pretty obviously by 250M.
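
A minimal, JDK-only sketch of the release pattern in this test, assuming the retained object sits in a ThreadLocal slot. The class `CrossThreadRelease`, the `CACHE` field, and the 1024-byte array are hypothetical stand-ins (this is not Lucene code, and the array is far smaller than the 250M RAMDirectory). It only illustrates that a release issued from another thread does not touch the creating thread's slot, while a release on the creating thread does.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a per-thread cache; not Lucene code.
class CrossThreadRelease {
    static final ThreadLocal<byte[]> CACHE = new ThreadLocal<>();

    // Size of whatever the *calling* thread currently holds in its slot.
    static int cachedSize() {
        byte[] value = CACHE.get();
        return value == null ? 0 : value.length;
    }

    // Returns {size seen by creator after a foreign release,
    //          size seen by creator after its own release}.
    static int[] run() {
        ExecutorService creator = Executors.newSingleThreadExecutor();
        try {
            Runnable fill = () -> CACHE.set(new byte[1024]);
            Runnable release = () -> CACHE.remove();
            Callable<Integer> read = CrossThreadRelease::cachedSize;

            // The creating thread populates its own slot.
            creator.submit(fill).get();

            // "Release" from another thread (the caller). This only affects
            // the caller's slot, which was never set.
            CACHE.remove();
            int afterForeignRelease = creator.submit(read).get();

            // Release on the creating thread itself.
            creator.submit(release).get();
            int afterOwnRelease = creator.submit(read).get();

            return new int[] {afterForeignRelease, afterOwnRelease};
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            creator.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // prints [1024, 0]
    }
}
```

The first number shows the creating thread still holding its value after the other thread's `remove()`; the second shows the slot empty once the creating thread clears it itself.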

I think we should not rely on something that's a "maybe" behavior, especially for a general purpose library.

Since it's a multi-threaded environment, the thread that creates the entries in the LRU cache may not go away quickly (in fact most, if not all, application servers try to reuse threads). So the LRU cache, which uses the thread as the key, cannot be released, and therefore the SegmentTermEnum, which is in the same class, cannot be released either.
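
The thread-reuse point can be illustrated with a small JDK-only sketch, assuming a one-thread pool stands in for an application server's worker pool. `PooledThreadRetention` and `SLOT` are hypothetical names, not Lucene code. The sketch shows a value set during one task still being visible to a later task because the pool hands both tasks to the same long-lived thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration of thread reuse; not Lucene code.
class PooledThreadRetention {
    static final ThreadLocal<String> SLOT = new ThreadLocal<>();

    // Returns {thread name of first task, thread name of later task,
    //          value the later task still finds in the slot}.
    static String[] run() {
        // One worker, reused for every task, like a pooled request thread.
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Callable<String> firstRequest = () -> {
                SLOT.set("cached by request 1");
                return Thread.currentThread().getName();
            };
            Callable<String> laterRequest = () -> Thread.currentThread().getName();
            Callable<String> readSlot = SLOT::get;

            String firstThread = pool.submit(firstRequest).get();
            String laterThread = pool.submit(laterRequest).get();
            String stillCached = pool.submit(readSlot).get();
            return new String[] {firstThread, laterThread, stillCached};
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println(r[0] + " / " + r[1] + " / " + r[2]);
    }
}
```

As long as the worker thread stays alive, whatever it stored in `SLOT` stays strongly reachable through that thread, which is the retention described above.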

And yes, I close the RAMDirectory, and the fileMap is released. I verified that through the profiler by directly checking the values in the snapshot.

I'm pretty sure the reference tree wasn't like this with the code before this commit, because after closing the searcher in another thread, the RAMDirectory totally disappeared from the memory snapshot.

-- 
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes: http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer, a shopping comparison site, (anonymous per request) got 2.6 Million Euro funding!

On Tue, Sep 9, 2008 at 5:03 PM, Michael McCandless <lucene@mikemccandless.com> wrote:

> Chris Lu wrote:
>
>> The problem should be similar to what's talked about on this discussion.
>> http://lucene.markmail.org/message/keosgz2c2yjc7qre?q=ThreadLocal
>
> The "rough" conclusion of that thread is that, technically, this isn't a memory leak but rather a "delayed freeing" problem.  Ie, it may take longer, possibly much longer, than you want for the memory to be freed.
>
>> There is a memory leak for Lucene search from Lucene-1195.(svn r659602, May23,2008)
>>
>> This patch brings in a ThreadLocal cache to TermInfosReader.
>
> One thing that confuses me: TermInfosReader was already using a ThreadLocal to cache the SegmentTermEnum instance.  What was added in this commit (for LUCENE-1195) was an LRU cache storing Term -> TermInfo instances.  But it seems like it's the SegmentTermEnum instance that you're tracing below.
>
>> It's usually recommended to keep the reader open, and reuse it when
>> possible. In a common J2EE application, the http requests are usually
>> handled by different threads. But since the cache is ThreadLocal, the cache
>> are not really usable by other threads. What's worse, the cache can not be
>> cleared by another thread!
>>
>> This leak is not so obvious usually. But my case is using RAMDirectory,
>> having several hundred megabytes. So one un-released resource is obvious to
>> me.
>>
>> Here is the reference tree:
>> org.apache.lucene.store.RAMDirectory
>>  |- directory of org.apache.lucene.store.RAMFile
>>     |- file of org.apache.lucene.store.RAMInputStream
>>         |- base of org.apache.lucene.index.CompoundFileReader$CSIndexInput
>>             |- input of org.apache.lucene.index.SegmentTermEnum
>>                 |- value of java.lang.ThreadLocal$ThreadLocalMap$Entry
>
> So you have a RAMDir that has several hundred MB stored in it, that you're done with yet through this path Lucene is keeping it alive?
>
> Did you close the RAMDir?  (which will null its fileMap and should also free your memory).
>
> Also, that reference tree doesn't show the ThreadResources class that was added in that commit -- are you sure this reference tree wasn't before the commit?
>
> Mike
