Message-ID: <6e3ae6310809092308r1cbe0c6bm4ab0a32d8245a13d@mail.gmail.com>
Date: Tue, 9 Sep 2008 23:08:31 -0700
From: "Chris Lu"
To: java-dev@lucene.apache.org
Reply-To: java-dev@lucene.apache.org
Mailing-List: contact java-dev-help@lucene.apache.org; run by ezmlm
Subject: Re: ThreadLocal causing memory leak with J2EE applications
In-Reply-To: <312CE5CC-5E21-464A-9559-083AE9C14050@ix.netcom.com>
References: <6e3ae6310809091157j7a9fe46bxcc31f6e63305fcdc@mail.gmail.com>
 <0712E8D6-1A9A-48B3-9BDC-7B9B01387F81@mikemccandless.com>
 <6e3ae6310809092010o1c184fbbo1364e6ac16e40c65@mail.gmail.com>
 <6e3ae6310809092021n3943a9f0xdfc3d62dbf0d2833@mail.gmail.com>
 <98A7B5A6-2ECA-4640-815D-17AE7EB5445E@ix.netcom.com>
 <6e3ae6310809092144v153d634ejea5c2372f72d1951@mail.gmail.com>
 <1C021C48-DC16-4DE1-BB7C-B89B67A5A685@ix.netcom.com>
 <312CE5CC-5E21-464A-9559-083AE9C14050@ix.netcom.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_Part_140637_27832761.1221026911368"

------=_Part_140637_27832761.1221026911368
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Yes. In the end, the IndexReader holds a large object via ThreadLocal.

On the one hand, I should pool IndexReader instances, because opening an
IndexReader costs a lot.
On the other hand, I should not pool them, because some resources are
cached via ThreadLocal and cannot be released unless all threads close the
IndexReader in the pool.

These contradictory requirements are caused by the ThreadLocal LRU cache
introduced in LUCENE-1195.

My only solution is to revert this particular patch.
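
To make the conflict concrete, here is a minimal sketch in plain Java. It
uses no Lucene classes: the class name, the CACHE field and the 1024-byte
array are hypothetical stand-ins for a per-thread cache that holds a large
object. It only illustrates the general ThreadLocal behavior discussed in
this thread, namely that a value stored by a pooled worker thread cannot be
cleared from a different thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the scenario in this thread; not Lucene code.
class PooledThreadLocalSketch {

    // Plays the role of a per-thread cache that holds a large object.
    static final ThreadLocal<byte[]> CACHE = new ThreadLocal<>();

    // Returns how many bytes the pooled worker still sees in its slot
    // after a different thread has tried to clear the cache.
    static int bytesStillHeldByWorker() {
        // One worker that is reused for every task, like a container thread.
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Task 1: the worker fills its own thread-local slot.
            pool.submit(() -> CACHE.set(new byte[1024])).get();

            // The calling thread tries to clean up. remove() only affects
            // the calling thread's slot, so the worker's value is untouched.
            CACHE.remove();

            // Task 2: same worker thread, the value is still reachable.
            return pool.submit(() -> {
                byte[] held = CACHE.get();
                return held == null ? 0 : held.length;
            }).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("worker still holds "
                + bytesStillHeldByWorker() + " bytes");
    }
}
```

As long as the worker thread stays alive, which is the normal case in an
application server, the value stays strongly reachable from that thread.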

-- 
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer, a shopping comparison site, (anonymous per request) got
2.6 Million Euro funding!

On Tue, Sep 9, 2008 at 10:46 PM, robert engels wrote:

> As a follow-up, the SegmentTermEnum does contain an IndexInput, and based
> on your configuration (buffer sizes, e.g.) this could be a large object,
> so you do need to be careful!
>
> On Sep 10, 2008, at 12:14 AM, robert engels wrote:
>
> A searcher uses an IndexReader - the IndexReader is slow to open, not a
> Searcher. And searchers can share an IndexReader.
> You want to create a single shared (across all threads/users) IndexReader
> (usually), and create a Searcher as needed and dispose of it afterwards.
> It is VERY CHEAP to create the Searcher.
>
> I am fairly certain the javadoc on Searcher is incorrect. The warning
> "For performance reasons it is recommended to open only one IndexSearcher
> and use it for all of your searches" is not true in the case where an
> IndexReader is passed to the ctor.
>
> Any caching should USUALLY be performed at the IndexReader level.
>
> You are most likely using the "path" ctor, and that is the source of your
> problems, as multiple IndexReader instances are being created, and thus
> the memory use.
>
> On Sep 9, 2008, at 11:44 PM, Chris Lu wrote:
>
> In a J2EE environment, there is usually a searcher pool with several
> searchers open. The time it takes to open a large index for every user is
> not acceptable.
>
> On Tue, Sep 9, 2008 at 9:03 PM, robert engels wrote:
>
>> You need to close the searcher within the thread that is using it, in
>> order to have it cleaned up quickly... usually right after you display
>> the page of results.
>> If you are keeping multiple searcher refs across multiple threads for
>> paging/whatever, you have not coded it correctly.
>>
>> Imagine 10,000 users - storing a searcher for each one is not going to
>> work...
>>
>> On Sep 9, 2008, at 10:21 PM, Chris Lu wrote:
>>
>> Right, in a sense I cannot release it from another thread. But that's
>> the problem.
>>
>> It's a J2EE environment; all threads are kind of equal. It's simply not
>> possible to iterate through all threads to close the searcher, thus
>> releasing the ThreadLocal cache.
>> Unless Lucene is not recommended for J2EE environments, this has to be
>> fixed.
>>
>> On Tue, Sep 9, 2008 at 8:14 PM, robert engels wrote:
>>
>>> Your code is not correct. You cannot release it on another thread - the
>>> first thread may create hundreds/thousands of instances before the
>>> other thread ever runs...
>>>
>>> On Sep 9, 2008, at 10:10 PM, Chris Lu wrote:
>>>
>>> If I release it on the thread that's creating the searcher, by setting
>>> searcher=null, everything is fine; the memory is released very cleanly.
>>> My load test was to repeatedly create a searcher on a RAMDirectory and
>>> release it on another thread. The test quickly goes to OOM after
>>> several runs. I set the heap size to 1024M, and the RAMDirectory is of
>>> size 250M. Using a profiling tool, the used size visibly stepped up by
>>> 250M.
>>>
>>> I think we should not rely on something that's a "maybe" behavior,
>>> especially for a general purpose library.
>>>
>>> Since it's a multi-threaded env, the thread that's creating the entries
>>> in the LRU cache may not go away quickly (actually most, if not all,
>>> application servers will try to reuse threads), so the LRU cache, which
>>> uses the thread as the key, cannot be released, and so the
>>> SegmentTermEnum, which is in the same class, cannot be released either.
>>>
>>> And yes, I close the RAMDirectory, and the fileMap is released. I
>>> verified that through the profiler by directly checking the values in
>>> the snapshot.
>>>
>>> I'm pretty sure the reference tree wasn't like this with the code
>>> before this commit, because after closing the searcher in another
>>> thread, the RAMDirectory totally disappeared from the memory snapshot.
>>>
>>> On Tue, Sep 9, 2008 at 5:03 PM, Michael McCandless
>>> <lucene@mikemccandless.com> wrote:
>>>
>>>> Chris Lu wrote:
>>>>
>>>>> The problem should be similar to what's talked about in this
>>>>> discussion.
>>>>> http://lucene.markmail.org/message/keosgz2c2yjc7qre?q=ThreadLocal
>>>>
>>>> The "rough" conclusion of that thread is that, technically, this isn't
>>>> a memory leak but rather a "delayed freeing" problem. I.e., it may
>>>> take longer, possibly much longer, than you want for the memory to be
>>>> freed.
>>>>
>>>>> There is a memory leak for Lucene search from LUCENE-1195 (svn
>>>>> r659602, May 23, 2008).
>>>>>
>>>>> This patch brings in a ThreadLocal cache to TermInfosReader.
>>>>
>>>> One thing that confuses me: TermInfosReader was already using a
>>>> ThreadLocal to cache the SegmentTermEnum instance. What was added in
>>>> this commit (for LUCENE-1195) was an LRU cache storing Term -> TermInfo
>>>> instances. But it seems like it's the SegmentTermEnum instance that
>>>> you're tracing below.
>>>>
>>>>> It's usually recommended to keep the reader open, and reuse it when
>>>>> possible. In a common J2EE application, the http requests are usually
>>>>> handled by different threads. But since the cache is ThreadLocal, the
>>>>> cache is not really usable by other threads. What's worse, the cache
>>>>> cannot be cleared by another thread!
>>>>>
>>>>> This leak is usually not so obvious. But my case is using a
>>>>> RAMDirectory holding several hundred megabytes, so one un-released
>>>>> resource is obvious to me.
>>>>>
>>>>> Here is the reference tree:
>>>>> org.apache.lucene.store.RAMDirectory
>>>>>  |- directory of org.apache.lucene.store.RAMFile
>>>>>      |- file of org.apache.lucene.store.RAMInputStream
>>>>>          |- base of org.apache.lucene.index.CompoundFileReader$CSIndexInput
>>>>>              |- input of org.apache.lucene.index.SegmentTermEnum
>>>>>                  |- value of java.lang.ThreadLocal$ThreadLocalMap$Entry
>>>>
>>>> So you have a RAMDir that has several hundred MB stored in it, that
>>>> you're done with, yet through this path Lucene is keeping it alive?
>>>>
>>>> Did you close the RAMDir? (which will null its fileMap and should also
>>>> free your memory).
>>>>
>>>> Also, that reference tree doesn't show the ThreadResources class that
>>>> was added in that commit -- are you sure this reference tree wasn't
>>>> from before the commit?
>>>>
>>>> Mike
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org

------=_Part_140637_27832761.1221026911368
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
------=_Part_140637_27832761.1221026911368--
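
The pattern suggested in the quoted replies (one shared reader opened once,
with a throwaway searcher created per request) can be sketched as follows.
This is only an illustration under stated assumptions: ExpensiveReader and
CheapSearcher are hypothetical stand-ins for IndexReader and a searcher
constructed from a reader, not Lucene classes, and the "index" is a single
hard-coded term.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-ins; no Lucene API is used or implied here.
class SharedReaderSketch {

    // Counts how often the expensive open step runs.
    static final AtomicInteger OPENS = new AtomicInteger();

    // Plays the role of the reader: costly to open, safe to share.
    static final class ExpensiveReader {
        ExpensiveReader() {
            OPENS.incrementAndGet(); // imagine slow index loading here
        }

        boolean contains(String term) {
            return "lucene".equals(term);
        }
    }

    // Plays the role of the searcher: a thin wrapper, cheap to create.
    static final class CheapSearcher {
        private final ExpensiveReader reader;

        CheapSearcher(ExpensiveReader reader) {
            this.reader = reader;
        }

        boolean search(String term) {
            return reader.contains(term);
        }
    }

    // Opened once and shared across all request threads.
    private static final ExpensiveReader SHARED = new ExpensiveReader();

    // Each request builds its own short-lived searcher over the shared reader.
    static boolean handleRequest(String term) {
        CheapSearcher searcher = new CheapSearcher(SHARED);
        return searcher.search(term);
    }

    // Runs n requests and reports how many times the reader was opened.
    static int opensAfterRequests(int n) {
        for (int i = 0; i < n; i++) {
            handleRequest("lucene");
        }
        return OPENS.get();
    }

    public static void main(String[] args) {
        System.out.println("reader opened " + opensAfterRequests(1000) + " time(s)");
    }
}
```

The sketch shows only the ownership pattern. It does not address the
per-thread caching inside the reader that the rest of the thread is about.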