lucene-java-user mailing list archives

From chanchitodata <chanchitod...@gmail.com>
Subject Re: Memory Eaten up by TermInfo Instances in Lucene 2.4
Date Tue, 10 Feb 2009 17:44:29 GMT

Hi Michael,

I actually don't hit OOM. The memory fills up to 100% and the JVM hangs,
regardless of which GC algorithm I use. I have tried all sorts of JVM
GC flags.
Profiling the application with YourKit, I can see that the TermInfo instances
do not get freed when the GC runs.

The application starts with around 100,000 instances of TermInfo and then
accumulates between 1,000 and 2,000 more instances per web request.
This is, if I understand you right, normal, but shouldn't the allocated
TermInfo instances be released after a GC? I never see the instances get
freed.
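
On the question of why cached instances can survive a GC: a bounded per-thread
cache keeps its entries strongly reachable for as long as the thread and the
cache are alive, so a collection will not reclaim them, but their number should
stop growing once the cache is full. A minimal Java sketch of that behaviour,
assuming a LinkedHashMap-based LRU held in a ThreadLocal (the class and method
names are made up for illustration; this is not Lucene's actual code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a bounded LRU cache kept per thread.
// Names are hypothetical and do not come from Lucene.
class PerThreadLruSketch {
    // Cache size mentioned in the thread (1024 entries).
    static final int CACHE_SIZE = 1024;

    // Builds an access-ordered map that evicts its eldest entry
    // once it holds more than maxEntries entries.
    static <K, V> Map<K, V> newLruCache(final int maxEntries) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // One cache per thread; entries stay reachable while the thread lives.
    static final ThreadLocal<Map<String, Object>> CACHE =
        new ThreadLocal<Map<String, Object>>() {
            @Override
            protected Map<String, Object> initialValue() {
                return PerThreadLruSketch.<String, Object>newLruCache(CACHE_SIZE);
            }
        };

    // Simulates a number of distinct term lookups on the current thread
    // and returns how many entries the cache holds afterwards.
    static int fillAndCount(int lookups) {
        Map<String, Object> cache = CACHE.get();
        for (int i = 0; i < lookups; i++) {
            cache.put("term" + i, new Object());
        }
        return cache.size();
    }

    public static void main(String[] args) {
        // However many lookups are made, the count never exceeds CACHE_SIZE.
        System.out.println(fillAndCount(5000));
    }
}
```

If the instance count in the profiler keeps climbing well past such a bound,
that would point to something other than an ordinary full cache.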

As for the SegmentReader instances, I actually see just 3 arrays of
SegmentReader instances.
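
For a rough sense of scale, the figures from Mike's reply quoted below (about
13 million terms, every 128th term in the terms index, a 1024-entry cache per
thread per segment) can be turned into a back-of-the-envelope estimate. The
thread and segment counts used here are assumptions picked for illustration,
not measurements from this application:

```java
// Back-of-the-envelope estimate of TermInfo instance counts.
// Class and method names are hypothetical, for illustration only.
class TermInfoEstimate {
    // Every 128th term is held in the terms index (figure from the thread).
    static final long INDEX_INTERVAL = 128;
    // Per-thread, per-segment LRU cache size (figure from the thread).
    static final long CACHE_SIZE = 1024;

    // Approximate number of entries in the persistent terms index.
    static long termsIndexEntries(long totalTerms) {
        return totalTerms / INDEX_INTERVAL;
    }

    // Upper bound on cached entries if every cache is completely full.
    static long maxCachedEntries(int threads, int segments) {
        return (long) threads * segments * CACHE_SIZE;
    }

    public static void main(String[] args) {
        long totalTerms = 13000000L; // ~13 million terms, from the thread
        int threads = 150;           // assumption: servlet container worker threads
        int segments = 10;           // assumption: segments across the open readers

        System.out.println("terms index entries: " + termsIndexEntries(totalTerms));
        System.out.println("max cached entries:  " + maxCachedEntries(threads, segments));
    }
}
```

With these assumed values the cache bound is far larger than the terms index
itself, so the thread and segment counts matter a great deal for the total.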


Michael McCandless-2 wrote:
> 
> Your index has relatively few terms: ~13 million.
> 
> Lucene stores TermInfo instances in two places.  The first place is a
> persistent array, called the terms index, of every 128th term.  It's
> created when the IndexReader is first opened.  So in your case this is
> ~100.000 ("100 thousand") instances.  The JPEG on your original post
> on the compass forum is tracing to this terms index.
> 
> Then, per thread per segment there is an LRU cache of recently used
> TermInfo instances.  That cache is 1024 in size.  Depending on how
> many threads, segments and how many unique queries you are testing
> with, this will add some number of TermInfo instances.
> 
> So... seeing a lot of TermInfo instances is fully normal.
> 
> Are you actually hitting OOM, or just noticing a lot of TermInfo
> instances in YourKit?
> 
> How many SegmentReader instances do you see held open in YourKit?
> 
> Mike
> 
> chanchitodata wrote:
> 
>>
>> Hi Michael,
>>
>> I'm pretty sure that the IndexReaders are being closed. As I said, I use
>> Compass, and Compass handles all the IndexReader stuff for me. I have
>> discussed this issue with Shay Banon for a while in the Compass forum, and
>> he was the guy who led me to this forum after several different tests we
>> did.
>>
>> I have attached a text file with the output of a CheckIndex run on the
>> biggest index I have. I have also attached an image with the Luke overview
>> of the same index.
>>
>> Best regards,
>> /Rodrigo
>>
>> http://www.nabble.com/file/p21932951/Checkindex.txt Checkindex.txt
>> http://www.nabble.com/file/p21932951/LukeOverview.JPG LukeOverview.JPG
>>
>> Michael McCandless-2 wrote:
>>>
>>>
>>> Are you certain that old IndexReaders are being closed?
>>>
>>> If you are not using CFS file format, how large are your *.tii files?
>>> If you are using CFS file format, can you run CheckIndex on your index
>>> and post the output?  This way we can see how many terms are in the
>>> index (which is what gets loaded as TermInfo instances).
>>>
>>> Mike
>>>
>>> chanchitodata wrote:
>>>
>>>>
>>>> Hi,
>>>>
>>>> I have a weird problem. I use Lucene 2.4 in a web application (Tomcat
>>>> 5.5.x), running under JDK 1.5. After a while (from 1 day to a couple,
>>>> depending on traffic) all memory gets eaten up by a lot of TermInfo
>>>> instances. I have profiled the application and I can see that the
>>>> TermInfo instances do not get recovered by the GC.
>>>> I also use Compass and have been posting on a thread in the Compass forum
>>>> (http://forum.compass-project.org/thread.jspa?threadID=215943&start=0&tstart=0),
>>>> thinking that it was Compass that held on to something in memory, but
>>>> we have come to the conclusion that it must be Lucene that keeps holding
>>>> on to the memory.
>>>>
>>>> I have read the thread
>>>> http://www.nabble.com/OutOfMemory-Problems-Lucene-2.4---Tomcat-td20236834.html#a20236834
>>>> and tried to divide my index into smaller parts but with the same
>>>> result.
>>>>
>>>> The index is about 1.5 GB, with around 2.7 million documents and
>>>> approx. 30 fields.
>>>>
>>>> /Rodrigo
>>>> -- 
>>>> View this message in context:
>>>> http://www.nabble.com/Memory-Eaten-up-by-TermInfo-Instances-in-Lucene-2.4-tp21913262p21913262.html
>>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>
>>
>> -- 
>> View this message in context:
>> http://www.nabble.com/Memory-Eaten-up-by-TermInfo-Instances-in-Lucene-2.4-tp21913262p21932951.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
> 
> 

-- 
View this message in context: http://www.nabble.com/Memory-Eaten-up-by-TermInfo-Instances-in-Lucene-2.4-tp21913262p21938848.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.



