lucene-java-user mailing list archives

From: Magnus Rundberget <run...@mac.com>
Subject: Re: lucene nicking my memory ?
Date: Thu, 04 Dec 2008 06:27:14 GMT
Well...

after various tests I downgraded to Lucene 1.9.1 to see if that had
any effect... it doesn't seem that way.

I have set up a JMeter test with 5 concurrent users each doing a search
(a silly search for a two-letter word) every 3 seconds (with a random
offset of +/- 500ms).

- With 512MB -Xms/-Xmx, memory usage settles between 400 and 500MB after
a few iterations, but no OOME.
At the end of the run memory usually settles somewhere between 200 and
300MB (it varies), but no cleanup occurs for minutes... unless I do a
forced GC.

- Did the same run with 384MB and hit 2 OOMEs.

- Did the same run with 256MB and hit 5 or 6 OOMEs.

I tried running Tomcat with JDK 1.6 and the -server option as well, but
that didn't seem to help either.

Finally I ran the test scenario above with 1536MB -Xms/-Xmx... and guess
what: it used it all pretty quickly. It stayed between 1000 and
1400-1500MB for most of the run. At the end of the run memory usage
settled at about 750MB... until I did a forced GC.
This does bother me. If the solution were simply to throw more memory at
the problem I could live with that, but it just seems to consume all the
memory it can get its hands on :-(
Is there any way to limit Lucene's memory usage (via configuration)?

I'm obviously not sure Lucene is the culprit, as I'm using Spring (2.5),
Hibernate (3.3), open-session-in-view and lots of other stuff in my app.
So I guess my next step is to create a very limited web app with just a
servlet calling the Lucene API, then do some profiling on that.
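
Something like the sketch below is what I have in mind (class, field and
path names are made up; 2.4-era API). If this bare-bones app shows the
same growth, it points at Lucene or my use of it; if not, the suspects
are back to Spring/Hibernate.

  import java.io.IOException;
  import javax.servlet.ServletException;
  import javax.servlet.http.*;
  import org.apache.lucene.index.Term;
  import org.apache.lucene.search.*;

  public class LuceneTestServlet extends HttpServlet {

      private IndexSearcher searcher;  // opened once, shared by all requests

      public void init() throws ServletException {
          try {
              searcher = new IndexSearcher("/data/testindex");  // example path
          } catch (IOException e) {
              throw new ServletException(e);
          }
      }

      protected void doGet(HttpServletRequest req, HttpServletResponse resp)
              throws IOException {
          // same silly two-letter search as in the JMeter scenario
          Query query = new TermQuery(new Term("content", req.getParameter("q")));
          Hits hits = searcher.search(query);
          resp.getWriter().println(hits.length() + " hits");
      }
  }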

cheers
Magnus

On 3 Dec 2008, at 14:45, Michael McCandless wrote:

>
> Are you actually hitting OOME?
>
> Or, you're watching heap usage and it bothers you that the GC is  
> taking a long time (allowing too much garbage to use up heap space)  
> before sweeping?
>
> One thing to try (only for testing) might be a lower and lower -Xmx  
> until you do hit OOME; then you'll know the "real" memory usage of  
> the app.
>
> Mike
>
> Magnus Rundberget wrote:
>
>> Sure,
>>
>> Tried with the following
>> Java version: build 1.5.0_16-b06-284 (dev), 1.5.0_12 (production)
>> OS: Mac OS X Leopard (dev) and Windows XP (dev), Windows 2003
>> (production)
>> Container: Jetty 6.1 and Tomcat 5.5 (the latter is used both in dev
>> and production)
>>
>>
>> Current JVM options:
>> -Xms512m -Xmx1024m -XX:MaxPermSize=256m
>> ... I tried a few GC settings as well, but nothing has helped (if
>> anything they slowed things down)
>>
>> Production hardware runs 2 dual-core Xeon processors.
>>
>> In production our memory reaches the 1024MB limit after a while (a
>> few hours), and at some point it stops responding to forced GC
>> (using JConsole).
>>
>> I need to dig quite a bit more to figure out the exact production
>> settings, but it's safe to say the memory usage pattern can be
>> recreated on different hardware configs, with different OSes,
>> different 1.5 JVMs and different containers (Jetty and Tomcat).
>>
>>
>>
>> cheers
>> Magnus
>>
>>
>>
>> On 3 Dec 2008, at 13:10, Glen Newton wrote:
>>
>>> Hi Magnus,
>>>
>>> Could you post the OS, version, RAM size, swap size, Java VM version,
>>> hardware, #cores, VM command-line parameters, etc.? This can be very
>>> relevant.
>>>
>>> Have you tried other garbage collectors and/or tuning as described  
>>> in
>>> http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html?
>>>
>>> 2008/12/3 Magnus Rundberget <rundis@mac.com>:
>>>> Hi,
>>>>
>>>> We have an application using Tomcat, Spring etc and Lucene 2.4.0.
>>>> Our index is about 100MB (in test) and has about 20 indexed fields.
>>>>
>>>> Performance is pretty good, but we are experiencing very high
>>>> memory usage when searching.
>>>>
>>>> Looking at JConsole during a somewhat silly scenario (which
>>>> nevertheless illustrates the problem), with a 512MB min heap
>>>> allocated and a 1024MB max:
>>>>
>>>> 0. Initially memory usage is about 70MB
>>>> 1. Search for the word "er"; heap memory usage goes up by 100-150MB
>>>> 1.1 Wait for 30 seconds... memory usage stays the same (i.e. no GC)
>>>> 2. Search for the word "og"; heap memory usage goes up another
>>>> 50-100MB
>>>> 2.1 See 1.1
>>>>
>>>> ...and so on until it reaches the 512MB limit, at which point a
>>>> garbage collection is performed,
>>>> i.e. garbage collection doesn't seem to occur until it "hits the
>>>> roof".
>>>>
>>>> We believe the scenario is similar in production, where our heap
>>>> space is limited to 1.5GB.
>>>>
>>>>
>>>> Our search is basically as follows
>>>> ----------------------------------------------
>>>> 1. Open an IndexSearcher
>>>> 2. Build a Boolean Query searching across 4 fields (title,  
>>>> summary, content
>>>> and daterangestring YYYYMMDD)
>>>> 2.1 Sort on title
>>>> 3. Perform search
>>>> 4. Iterate over hits to build a set of custom result objects
>>>> (pretty small, as we don't include content in these)
>>>> 5. Close searcher
>>>> 6. Return result objects.
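>>>>
>>>> In code that is roughly (a sketch; the search word, date values
>>>> and the SearchResult class are just illustrative, 2.4-era API):
>>>>
>>>>   String word = "er";  // example two-letter search term
>>>>   IndexSearcher searcher = new IndexSearcher("/data/index");   // 1
>>>>   BooleanQuery query = new BooleanQuery();                     // 2
>>>>   query.add(new TermQuery(new Term("title", word)),
>>>>             BooleanClause.Occur.SHOULD);
>>>>   query.add(new TermQuery(new Term("summary", word)),
>>>>             BooleanClause.Occur.SHOULD);
>>>>   query.add(new TermQuery(new Term("content", word)),
>>>>             BooleanClause.Occur.SHOULD);
>>>>   query.add(new RangeQuery(new Term("daterangestring", "20080101"),
>>>>             new Term("daterangestring", "20081231"), true),
>>>>             BooleanClause.Occur.MUST);  // date filter
>>>>   Hits hits = searcher.search(query, new Sort("title"));       // 2.1 + 3
>>>>   List results = new ArrayList();
>>>>   for (int i = 0; i < hits.length(); i++) {                    // 4
>>>>       Document doc = hits.doc(i);
>>>>       results.add(new SearchResult(doc.get("title")));
>>>>   }
>>>>   searcher.close();                                            // 5
>>>>   return results;                                              // 6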
>>>
>>> You should not close the searcher: it can be shared by all queries.
>>> What happens when you warm Lucene with a (large) number of  
>>> queries: do
>>> things stabilize over time?
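>>>
>>> For the sharing, something like this would do (just a sketch, the
>>> names are mine; an IndexSearcher is thread-safe for concurrent
>>> searches):
>>>
>>>   public final class SearcherHolder {
>>>       private static IndexSearcher searcher;
>>>
>>>       public static synchronized IndexSearcher get() throws IOException {
>>>           if (searcher == null) {
>>>               searcher = new IndexSearcher("/data/index");  // open once
>>>           }
>>>           return searcher;
>>>       }
>>>       // no close() per query; close only when the index is replaced
>>>   }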
>>>
>>> A 100MB index is (relatively) very small for Lucene (I have indexes
>>> >100GB). What kind of response times are you getting, independent of
>>> memory usage?
>>>
>>> -glen
>>>
>>>>
>>>> We have tried various options based on entries on this mailing
>>>> list:
>>>> a) Caching the IndexSearcher - same result
>>>> b) Removing sorting - same result
>>>> c) In point 4, iterating over only a limited number of hits rather
>>>> than the whole collection - same result in terms of memory usage,
>>>> but obviously better performance
>>>> d) Using RAMDirectory vs FSDirectory - same result, only the
>>>> initial heap usage is higher with RAMDirectory (in conjunction
>>>> with a cached IndexSearcher)
>>>>
>>>>
>>>> Profiling with YourKit shows a huge number of char[], int[] and
>>>> String[] instances, and an ever-increasing number of Lucene-related
>>>> objects.
>>>>
>>>>
>>>>
>>>> Reading through the mailing lists, our suspicion is that the
>>>> problem is related to ThreadLocals and memory not being released.
>>>> We noticed that there was a related patch for this in 2.4.0, but it
>>>> doesn't seem to help us much.
>>>>
>>>> Any ideas ?
>>>>
>>>> kind regards
>>>> Magnus
>>>>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

