lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Memory Eaten up by TermInfo Instances in Lucene 2.4
Date Tue, 10 Feb 2009 14:52:45 GMT
Your index has relatively few terms: ~13 million.

Lucene stores TermInfo instances in two places.  The first place is a
persistent array, called the terms index, holding every 128th term.
It's created when the IndexReader is first opened.  So in your case
this is ~100,000 instances (13 million / 128).  The JPEG in your
original post on the Compass forum traces back to this terms index.
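
As a rough check of that arithmetic, here is a minimal sketch.  It is
not Lucene code: the class and method names are made up for
illustration, and it simply assumes one in-memory entry per 128 terms,
rounded up.

```java
// Hypothetical helper, not part of Lucene: estimates how many terms
// end up in the in-memory terms index.
final class TermsIndexEstimate {

    // Sampling interval described above: every 128th term is kept.
    static final int INDEX_INTERVAL = 128;

    // Number of sampled terms, rounding up so a partial final block
    // still counts as one entry (an assumption of this sketch).
    static long indexedTerms(long totalTerms, int interval) {
        if (totalTerms < 0 || interval <= 0) {
            throw new IllegalArgumentException("invalid input");
        }
        return (totalTerms + interval - 1) / interval;
    }

    public static void main(String[] args) {
        long totalTerms = 13000000L; // ~13 million terms, from CheckIndex
        System.out.println(indexedTerms(totalTerms, INDEX_INTERVAL));
    }
}
```

For ~13 million terms this comes to roughly 100 thousand entries,
which matches the figure above.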

Then, per thread per segment, there is an LRU cache of recently used
TermInfo instances.  That cache holds up to 1024 entries.  Depending
on how many threads and segments you have, and how many unique queries
you are testing with, this will add some number of TermInfo instances.
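
To get a feel for how large that second source can grow, here is a
minimal sketch.  It is not Lucene's internal cache: it uses a plain
java.util.LinkedHashMap in access order as a stand-in LRU, and the
class and method names are hypothetical.  The upper bound assumes
every thread has touched every segment and filled its cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not Lucene's own cache implementation.
final class TermInfoCacheSketch {

    // Per-thread, per-segment cache size described above.
    static final int CACHE_SIZE = 1024;

    // A small LRU map: access-ordered LinkedHashMap that evicts the
    // least recently used entry once it grows past 'capacity'.
    static <K, V> Map<K, V> newLruCache(final int capacity) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    // Worst case: every thread has a full cache for every segment.
    static long maxCachedInstances(int threads, int segments, int cacheSize) {
        return (long) threads * segments * cacheSize;
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = newLruCache(CACHE_SIZE);
        for (int i = 0; i < 5000; i++) {
            cache.put("term" + i, i); // older entries are evicted
        }
        System.out.println("entries held: " + cache.size());
        // Thread and segment counts here are made-up example values.
        System.out.println("upper bound: "
                + maxCachedInstances(50, 10, CACHE_SIZE));
    }
}
```

The point is that the cached count is bounded by threads x segments x
1024, so it levels off rather than growing without limit.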

So... seeing a lot of TermInfo instances is completely normal.

Are you actually hitting OOM, or just noticing a lot of TermInfo
instances in YourKit?

How many SegmentReader instances do you see held open in YourKit?

Mike

chanchitodata wrote:

>
> Hi Michael,
>
> I'm pretty sure that the IndexReaders are being closed. As I said, I
> use Compass, and Compass handles all the IndexReader stuff for me. I
> have discussed this issue with Shay Banon for a while in the Compass
> forum, and he was the one who led me to this forum after the several
> different tests we did.
>
> I have attached a text file with the output of a CheckIndex on the
> biggest index I have. I have also attached an image with the Luke
> overview of the same index.
>
> Best regards,
> /Rodrigo
>
> http://www.nabble.com/file/p21932951/Checkindex.txt Checkindex.txt
> http://www.nabble.com/file/p21932951/LukeOverview.JPG LukeOverview.JPG
>
> Michael McCandless-2 wrote:
>>
>>
>> Are you certain that old IndexReaders are being closed?
>>
>> If you are not using CFS file format, how large are your *.tii files?
>> If you are using CFS file format, can you run CheckIndex on your
>> index and post the output?  This way we can see how many terms are
>> in the index (which is what gets loaded as TermInfo instances).
>>
>> Mike
>>
>> chanchitodata wrote:
>>
>>>
>>> Hi,
>>>
>>> I have a weird problem. I use Lucene 2.4 in a web application
>>> (Tomcat 5.5.x), running under JDK 1.5. After a while (from one day
>>> to a couple, depending on traffic) all memory gets eaten up by a lot
>>> of TermInfo instances. I have profiled the application and I can see
>>> that the TermInfo instances are not reclaimed by the GC.
>>> I also use Compass and have been posting on a thread in the Compass
>>> forum
>>> (http://forum.compass-project.org/thread.jspa?threadID=215943&start=0&tstart=0),
>>> thinking that it was Compass that held on to something in memory,
>>> but we have come to the conclusion that it must be Lucene that keeps
>>> holding on to the memory.
>>>
>>> I have read the thread
>>> http://www.nabble.com/OutOfMemory-Problems-Lucene-2.4---Tomcat-td20236834.html#a20236834
>>> and tried to divide my index into smaller parts but with the same
>>> result.
>>>
>>> The index is about 1.5 GB, with around 2.7 million documents and
>>> approx. 30 fields.
>>>
>>> /Rodrigo
>>> -- 
>>> View this message in context:
>>> http://www.nabble.com/Memory-Eaten-up-by-TermInfo-Instances-in-Lucene-2.4-tp21913262p21913262.html
>>> Sent from the Lucene - Java Users mailing list archive at  
>>> Nabble.com.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>
>
> -- 
> View this message in context: http://www.nabble.com/Memory-Eaten-up-by-TermInfo-Instances-in-Lucene-2.4-tp21913262p21932951.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
