lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ruben Laguna <ruben.lag...@gmail.com>
Subject Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Date Sat, 10 Apr 2010 04:26:28 GMT
Take a memory snapshot with JConsole -> dumpHeap [1] and the analyze
it with Eclipse MAT [2]. Find the biggest objects and look at their
path to GC roots to see if lucene is actually retaining them. You may
also want to look to two recently closed bug reports about memory
leaks [3] and [4]

[1] http://java.sun.com/developer/technicalArticles/J2SE/monitoring/
[2] http://www.eclipse.org/mat/
[3] https://issues.apache.org/jira/browse/LUCENE-2387
[4] https://issues.apache.org/jira/browse/LUCENE-2384

On Sat, Apr 10, 2010 at 5:23 AM, Herbert Roitblat <herb@orcatec.com> wrote:
>
> Hi, folks.
> I am using PyLucene and doing a lot of get tokens.  lucene.py reports version 2.4.0.
 It is rpath linux with 8GB of memory.  Python is 2.4.
> I'm not sure what the maxheap is, I think that it is maxheap='2048m'.  I think that
it's running in a 64 bit environment.
> It indexes a set of 116,000 documents just fine.
> Then I need to get the tokens from these documents and near the end, I run into:
>
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>
> If I wait a bit and ask again for the same document's tokens, I can get them, but it
then is somewhat likely to post the same error on a certain number of other documents.  I
can handle these errors and ask again.
>
> I have read that this error message means that the heap is getting filled up and garbage
collection removes only a small amount of it.  Since all I am doing is retrieving, why should
the heap be filling up?  I restarted the system before starting the retrieval.
>
> My guess is that there is some small memory leak because memory assigned to my python
program grows slowly as I request more document tokens.  Since I'm not intending to change
anything in either my python program or in Lucene, any growth is unintentional.  I'm just
getting tokens.
>
>  we use lucene.TermQuery as the query object to get the terms.
>
> I cannot share the documents nor the application code, but I might be able to provide
snippets.
>
> One last piece of information, the time needed to retrieve documents slows throughout
the process.  In the beginning I was getting about 10 documents per second.  Towards the
end, it is down to about 5 with about 5 second pauses from time to time, perhaps due to garbage
collection?
>
> Any idea of why the heap is filling up and what I can do about it?
>
> Thanks,
> Herb
>
>



--
/Rubén

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message