lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: lucene nicking my memory ?
Date Wed, 03 Dec 2008 13:45:03 GMT

Are you actually hitting OOME?

Or are you watching heap usage, and it bothers you that the GC lets a
lot of garbage build up in the heap before sweeping?

One thing to try (only for testing) might be progressively lower -Xmx
settings until you do hit OOME; then you'll know the "real" memory
usage of the app.
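
For a rough in-process check, something along these lines also works
(a minimal sketch; note System.gc() is only a hint to the VM, so
treat the number as approximate):

  // ask for a GC, then sample used heap via the Runtime API
  Runtime rt = Runtime.getRuntime();
  System.gc();
  long usedMB = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
  System.out.println("approx used heap: " + usedMB + " MB");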

Mike

Magnus Rundberget wrote:

> Sure,
>
> Tried with the following:
> Java version: build 1.5.0_16-b06-284 (dev), 1.5.0_12 (production)
> OS: Mac OS X Leopard (dev), Windows XP (dev), Windows 2003 (production)
> Container: Jetty 6.1 and Tomcat 5.5 (the latter is used in both dev
> and production)
>
>
> Current JVM options:
> -Xms512m -Xmx1024M -XX:MaxPermSize=256m
> We tried a few GC settings as well, but nothing has helped (if
> anything, they slowed things down).
>
> The production hardware runs 2 dual-core Xeon processors.
>
> In production our memory reaches the 1024 MB limit after a while (a
> few hours), and at some point it stops responding to forced GC (via
> JConsole).
>
> We need to dig quite a bit more to figure out the exact production
> settings, but it's safe to say the memory usage pattern can be
> recreated on different hardware configs, with different OSes,
> different 1.5 JVMs and different containers (Jetty and Tomcat).
>
>
>
> cheers
> Magnus
>
>
>
> On 3 Dec 2008, at 13:10, Glen Newton wrote:
>
>> Hi Magnus,
>>
>> Could you post the OS, version, RAM size, swap size, Java VM
>> version, hardware, #cores, VM command line parameters, etc.? This
>> can be very relevant.
>>
>> Have you tried other garbage collectors and/or tuning as described in
>> http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html?
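>>
>> For example (a sketch of options, assuming a 1.5 HotSpot VM):
>> -XX:+UseConcMarkSweepGC trades some throughput for shorter pauses,
>> while -XX:+UseParallelGC favors throughput; both are worth a quick
>> comparison.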
>>
>> 2008/12/3 Magnus Rundberget <rundis@mac.com>:
>>> Hi,
>>>
>>> We have an application using Tomcat, Spring etc. and Lucene 2.4.0.
>>> Our index is about 100MB (in test) and has about 20 indexed fields.
>>>
>>> Performance is pretty good, but we are seeing very high memory
>>> usage when searching.
>>>
>>> Looking at JConsole during a somewhat silly scenario (but it
>>> illustrates the problem), with min heap 512 MB and max 1024 MB:
>>>
>>> 0. Initially memory usage is about 70MB
>>> 1. Search for the word "er"; heap usage goes up by 100-150MB
>>> 1.1 Wait for 30 seconds... memory usage stays the same (i.e. no GC)
>>> 2. Search for the word "og"; heap usage goes up another 50-100MB
>>> 2.1 See 1.1
>>>
>>> ...and so on until it seems to reach the 512 MB limit, at which
>>> point a garbage collection is finally performed, i.e. GC doesn't
>>> seem to occur until it "hits the roof".
>>>
>>> We believe the scenario is similar in production, where our heap
>>> space is limited to 1.5 GB.
>>>
>>>
>>> Our search is basically as follows (sketched in code below):
>>> ----------------------------------------------
>>> 1. Open an IndexSearcher
>>> 2. Build a BooleanQuery searching across 4 fields (title, summary,
>>> content and a date-range string, YYYYMMDD)
>>> 2.1 Sort on title
>>> 3. Perform the search
>>> 4. Iterate over the hits to build a set of custom result objects
>>> (pretty small, as we don't include content in these)
>>> 5. Close the searcher
>>> 6. Return the result objects
>>
>> You should not close the searcher: it can be shared by all queries.
>> What happens when you warm Lucene with a (large) number of queries:
>> do things stabilize over time?
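>>
>> For example (a minimal sketch; the path is made up): open one
>> searcher at startup and reuse it, since IndexSearcher is thread-safe
>> for concurrent searches:
>>
>>   // opened once at application startup, reused by every request
>>   Directory dir = FSDirectory.getDirectory("/path/to/index");
>>   IndexSearcher sharedSearcher = new IndexSearcher(dir);
>>   // per request: build the query and call sharedSearcher.search(...),
>>   // but don't close(); close and reopen only after index updates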
>>
>> A 100MB index is (relatively) very small for Lucene (I have indexes
>> over 100GB). What kind of response times are you getting,
>> independent of memory usage?
>>
>> -glen
>>
>>>
>>> We have tried various options based on entries on this mailing list:
>>> a) Caching the IndexSearcher - same result
>>> b) Removing sorting - same result
>>> c) In point 4, iterating over only a limited number of hits rather
>>> than the whole collection - same result in terms of memory usage,
>>> but obviously better performance
>>> d) Using RAMDirectory vs FSDirectory (sketched below) - same
>>> result, only the initial heap usage is higher with RAMDirectory (in
>>> conjunction with a cached IndexSearcher)
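>>>
>>> For d) we load the on-disk index into memory roughly like this (a
>>> sketch; the path is made up):
>>>
>>>   // copies the whole FSDirectory index into RAM
>>>   Directory ram = new RAMDirectory(FSDirectory.getDirectory("/path/to/index"));
>>>   IndexSearcher searcher = new IndexSearcher(ram);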
>>>
>>>
>>> Profiling with YourKit shows a huge number of char[], int[] and
>>> String[] instances, and an ever-increasing number of Lucene-related
>>> objects.
>>>
>>>
>>>
>>> Reading through the mailing lists, our suspicion is that the
>>> problem is related to ThreadLocals and memory not being released.
>>> We noticed that there was a related patch for this in 2.4.0, but it
>>> doesn't seem to help us much.
>>>
>>> Any ideas?
>>>
>>> kind regards
>>> Magnus


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

