lucene-dev mailing list archives

From "Robert Engels" <reng...@ix.netcom.com>
Subject RE: [jira] Created: (LUCENE-529) TermInfosReader and other + instance ThreadLocal => transient/odd memory leaks => OutOfMemoryException
Date Wed, 22 Mar 2006 19:17:22 GMT
Creating and destroying threads is one of the worst performing operations,
and should be avoided at ALMOST all costs.

I do not see this problem in my server impl of Lucene, internally
multithreaded, and accessed via multiple threads from a Tomcat server. I
have to assume many (most?) users of Lucene are doing so in a multithreaded
server environment.

I reviewed the bugs in Sun's Java bug database related to memory leaks with
ThreadLocals. I don't think any of them apply in this case.

Maybe you could provide a simplified ThreadLocal testcase that demonstrates
the 'out of memory' condition?

Are you sure that you do not have a modified version of Lucene that
somehow maintains a reference back to the ThreadLocal from the
ThreadLocal's value? That is a known JDK issue:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6254531
I don't see this bug as being applicable to the 1.9.1 or 1.4.3 code.

Did you try running your server using 1.4.3? (Our server code is based on
the 1.4.3 codebase at this time.)


-----Original Message-----
From: Andy Hind [mailto:andy.hind@alfresco.org]
Sent: Wednesday, March 22, 2006 12:48 PM
To: java-dev@lucene.apache.org; rengels@ix.netcom.com
Subject: RE: [jira] Created: (LUCENE-529) TermInfosReader and other +
instance ThreadLocal => transient/odd memory leaks =>
OutOfMemoryException



For every IndexReader that is opened
- there is one SegmentReader for every segment in the index
   - with its thread local
   - for each of these there is a TermInfosReader + its thread local.

So I get 2 * (no of index segments) thread locals.

I am creating index readers for a main index and for transactional
updates, and layering the two. At the moment this is an issue: under
stress testing with Tomcat and thread pooling, against a pretty big,
changing index, it blows up after running for a few hours.

Thread locals are also used in other areas of the app.

It would be better if threads were created and destroyed!

It is certainly not insignificant for me and gives a JVM that creeps up
in size pretty steadily over time.

I have fixed this issue locally in the code and it works.

Regards

Andy




-----Original Message-----
From: Robert Engels [mailto:rengels@ix.netcom.com]
Sent: 22 March 2006 17:46
To: java-dev@lucene.apache.org
Subject: RE: [jira] Created: (LUCENE-529) TermInfosReader and other +
instance ThreadLocal => transient/odd memory leaks =>
OutOfMemoryException

There was a small mistake - there is a single TermInfosReader per
segment.

-----Original Message-----
From: Robert Engels [mailto:rengels@ix.netcom.com]
Sent: Wednesday, March 22, 2006 11:37 AM
To: java-dev@lucene.apache.org
Subject: RE: [jira] Created: (LUCENE-529) TermInfosReader and other +
instance ThreadLocal => transient/odd memory leaks =>
OutOfMemoryException


There is only a single TermInfosReader per index. In order to share this
instance with multiple threads, and avoid the overhead of creating new
enumerators for each request, the enumerator for the thread is stored in
a thread local. Normally, in a server application, threads are pooled
rather than constantly created and destroyed, so the memory leak is
insignificant.

The same reasoning holds true for the SegmentReader class.
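
The pattern described above can be sketched roughly as follows. This is
only an illustration, not the Lucene source: SharedReaderSketch and
SketchEnumerator are hypothetical stand-ins for TermInfosReader and its
enumerator, and the helper that starts a second thread exists purely to
show that each thread gets its own enumerator.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a term enumerator; not Lucene's actual class.
final class SketchEnumerator {
    private static final AtomicInteger NEXT_ID = new AtomicInteger();
    final int id = NEXT_ID.incrementAndGet();
}

// Hypothetical stand-in for a reader shared by many threads.
final class SharedReaderSketch {
    // Instance-level ThreadLocal: one slot per (reader, thread) pair.
    private final ThreadLocal<SketchEnumerator> enumerators = new ThreadLocal<>();

    // Returns the calling thread's enumerator, creating it on first use.
    SketchEnumerator getEnumerator() {
        SketchEnumerator e = enumerators.get();
        if (e == null) {
            e = new SketchEnumerator();
            enumerators.set(e);
        }
        return e;
    }

    // Returns the enumerator that a freshly started thread sees.
    static SketchEnumerator enumeratorOnOtherThread(SharedReaderSketch reader) {
        SketchEnumerator[] seen = new SketchEnumerator[1];
        Thread t = new Thread(() -> seen[0] = reader.getEnumerator());
        t.start();
        try {
            t.join(); // join makes the write to seen[0] visible here
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(ex);
        }
        return seen[0];
    }
}
```

Repeated calls on one thread reuse the same enumerator, while a second
thread gets a separate one. With a thread pool, each pooled thread keeps
its enumerator for as long as the thread's table holds the entry.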


-----Original Message-----
From: Andy Hind (JIRA) [mailto:jira@apache.org]
Sent: Wednesday, March 22, 2006 11:07 AM
To: java-dev@lucene.apache.org
Subject: [jira] Created: (LUCENE-529) TermInfosReader and other +
instance ThreadLocal => transient/odd memory leaks =>
OutOfMemoryException


TermInfosReader and other + instance ThreadLocal => transient/odd memory
leaks => OutOfMemoryException
------------------------------------------------------------------------

         Key: LUCENE-529
         URL: http://issues.apache.org/jira/browse/LUCENE-529
     Project: Lucene - Java
        Type: Bug
  Components: Index
    Versions: 1.9
 Environment: Lucene 1.4.3 with 1.5.0_04 JVM or newer......will apply to
1.9 code
    Reporter: Andy Hind


TermInfosReader uses an instance level ThreadLocal for enumerators.
This is a transient/odd memory leak in Lucene 1.4.3-1.9 and applies to
current JVMs; it is not just an old JVM issue as described in the
finalizer of the 1.9 code.

There is also an instance level thread local in SegmentReader....which
will have the same issue.
There may be other uses which also need to be fixed.

I don't understand the intended use for these variables.....however

Each ThreadLocal has its own hashcode used for look up, see the
ThreadLocal source code. Each instance of TermInfosReader will be
creating an instance of the thread local. All this does is create an
instance variable on each thread when it accesses the thread local.
Setting it to null in the finalizer will set it to null on one thread,
the finalizer thread, where it has never been created.  There is no
point to this :-(
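
The point about clearing on the wrong thread can be shown with a small
sketch. WrongThreadClearSketch is a hypothetical class, and an ordinary
thread stands in for the finalizer thread; the behaviour of
ThreadLocal.set() is the same either way, since it only touches the slot
of the thread that calls it.

```java
// Hypothetical holder with an instance-level ThreadLocal.
final class WrongThreadClearSketch {
    private final ThreadLocal<Object> slot = new ThreadLocal<>();

    void put(Object value) { slot.set(value); }

    Object current() { return slot.get(); }

    // Clears the slot of whichever thread calls this method.
    void clearOnCallingThread() { slot.set(null); }

    // Sets a value on the calling thread, clears on a different thread
    // (standing in for the finalizer thread), then returns what the
    // calling thread still sees.
    static Object valueAfterClearElsewhere() {
        WrongThreadClearSketch holder = new WrongThreadClearSketch();
        holder.put(new Object());
        Thread other = new Thread(holder::clearOnCallingThread);
        other.start();
        try {
            other.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(ex);
        }
        return holder.current(); // untouched by the other thread's clear
    }
}
```

The value set by the first thread is still there after the other thread
has "cleared" the thread local.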

I assume there is a good concurrency reason why an instance variable can
not be used...

I have not used multi-threaded searching, but I have used a lot of
threads each making searchers and searching.
1.4.3 has a clear memory leak caused by this thread local. This use case
above is definitely solved by setting the thread local to null in the
close(). This at least has a chance of being on the correct thread :-)
I know reusing Searchers would help but that is my choice and I will get
to that later ....

Now you want to know why....

Thread locals are stored in a per-thread table of entries. Each entry is
a *weak reference* to the key (here the ThreadLocal instance owned by the
TermInfosReader) and a *simple reference* to the thread local value. When
the TermInfosReader instance is GCed, its ThreadLocal goes with it and the
entry's key becomes null.
This is now a stale entry in the table.
Stale entries are cleared up in an ad hoc way and until they are cleared
up the value will not be garbage collected.
Until the instance is GCed it is a valid key and its presence may cause
the table to expand.
See the ThreadLocal code.
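
A simplified model of such an entry, for illustration only. This is not
the JDK's code: StaleEntrySketch and expungeStale are hypothetical names,
and calling WeakReference.clear() by hand stands in for the garbage
collector clearing the key.

```java
import java.lang.ref.WeakReference;

final class StaleEntrySketch {
    // Simplified model of a table entry: weak key, strong value.
    static final class Entry extends WeakReference<Object> {
        Object value; // strong reference; survives the loss of the key

        Entry(Object key, Object value) {
            super(key);
            this.value = value;
        }

        // An entry is stale once its key has been cleared.
        boolean isStale() { return get() == null; }
    }

    // Drops stale entries so their values become collectable.
    // Returns the number of entries removed.
    static int expungeStale(Entry[] table) {
        int removed = 0;
        for (int i = 0; i < table.length; i++) {
            Entry e = table[i];
            if (e != null && e.isStale()) {
                e.value = null;
                table[i] = null;
                removed++;
            }
        }
        return removed;
    }
}
```

Until something like expungeStale runs over the table, a stale entry
still holds its value strongly, which is what keeps the value from being
garbage collected.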

So if you have lots of threads, all creating thread locals rapidly, you
can get each thread holding a large table of thread locals, each table
containing many stale entries and preventing some objects from being
garbage collected.
The limited GC of the thread local table is not enough to save you from
running out of memory.

Summary:
========
- remove finalizer()
- set the thread local to null in close()
  - values will be available for gc
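
A rough sketch of the close() approach summarised above, under the
assumption that close() is called on the thread that used the reader.
ClosableReaderSketch is a hypothetical class, not the actual patch. It
uses set(null) as described; on Java 5 and later ThreadLocal.remove()
would also drop the entry itself.

```java
// Hypothetical reader that releases its per-thread value in close().
final class ClosableReaderSketch {
    private final ThreadLocal<Object> enumerators = new ThreadLocal<>();
    private volatile boolean closed;

    // Returns the calling thread's enumerator, creating it on first use.
    Object getEnumerator() {
        if (closed) {
            throw new IllegalStateException("reader is closed");
        }
        Object e = enumerators.get();
        if (e == null) {
            e = new Object(); // placeholder for a real enumerator
            enumerators.set(e);
        }
        return e;
    }

    // Clears the calling thread's slot so the value is available for GC.
    // Slots belonging to other threads are not touched.
    void close() {
        enumerators.set(null);
        closed = true;
    }

    boolean hasEnumeratorOnThisThread() {
        return enumerators.get() != null;
    }
}
```

This only releases the value for the thread that calls close(); values
cached by other threads still wait for the table's own stale-entry
cleanup.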

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org



