lucene-lucene-net-user mailing list archives

From "Oliver Donald" <oliver.don...@SnowValley.com>
Subject RE: Lucene memory usage
Date Wed, 01 Nov 2006 12:41:25 GMT
Hi George,

I'm not sure yet when the leak happens, sorry.

I'm using .NET 1, and the memory profiler was MemProfiler from
http://memprofiler.com/. 

I will try and get some sample code for you, but I doubt I'll have time
today! Tight deadlines :(

Cheers,
Oli

-----Original Message-----
From: George Aroush [mailto:george@aroush.net] 
Sent: 01 November 2006 12:31
To: lucene-net-user@incubator.apache.org
Subject: RE: Lucene memory usage

Hi Oli,

Whenever possible, it's best to keep one instance of the reader open at
all times -- opening and closing a reader is a speed killer.

Are you saying you were seeing a memory leak when you open/close the
reader? If that's the case, please do investigate this further for us,
or maybe post some sample code that shows the problem.

Also, can you tell us which memory profiler you are using and whether
you are using .NET 1.1 or .NET 2.0? With .NET 1.0, opening/closing a
reader does show some memory growth, but not with 2.0 (in my test).

Regards,

-- George Aroush 

-----Original Message-----
From: Oliver Donald [mailto:oliver.donald@SnowValley.com] 
Sent: Wednesday, November 01, 2006 7:16 AM
To: lucene-net-user@incubator.apache.org
Subject: RE: Lucene memory usage

Hi,

I've now got our wrapper using a single instance, and the memory usage
is flat! It's also quite a bit quicker now that Lucene has a chance to
warm up its cache a bit more.

It still seems odd that the memory usage went up when opening and
closing the IndexSearcher, though. I isolated the code so that there was
practically nothing referencing the results or the searcher. Using the
memory profiler I could follow all the object references, and many of
the Lucene resources that were hanging around could have all of their
references traced back to root without any of them referencing my code,
but still the memory usage crept up, albeit slowly.

I'm quite busy right now but I'll try and isolate some example code if
you are interested.

Thanks a lot though guys, was very helpful!

Thanks,
Oli

-----Original Message-----
From: George Aroush [mailto:george@aroush.net]
Sent: 01 November 2006 04:52
To: lucene-net-user@incubator.apache.org
Subject: RE: Lucene memory usage

Hi Oli,

If you can post some example code, it can go a long way toward helping
someone here see what's going on and offer some help.

I have written a web server application as well as an ASP.NET
application where Lucene.Net keeps track of 3 indexes. Memory usage has
been steady.

From this email thread, like Andy said, my guess is that you are not
closing a searcher, or are opening a new searcher over and over. Try
this simple test: make your searcher static and global. Don't forget to
do the same for the analyzer.
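
A minimal sketch of the "static and global" test suggested above. Both
SharedSearcher and FakeSearcher are hypothetical names, not Lucene.Net
types; in a real application the stand-in would be the IndexSearcher
discussed in this thread. The sketch only shows that every caller gets
the same instance and that the expensive open happens once:

```csharp
using System;

// Hypothetical stand-in for an expensive-to-open searcher; not a Lucene.Net type.
public sealed class FakeSearcher
{
    public static int OpenCount;   // number of searchers constructed so far

    public FakeSearcher(string indexPath)
    {
        IndexPath = indexPath;
        OpenCount++;
    }

    public string IndexPath { get; private set; }
}

// Hypothetical holder: one searcher shared by every request in the process.
public static class SharedSearcher
{
    private static readonly object Gate = new object();
    private static FakeSearcher instance;

    public static FakeSearcher Get(string indexPath)
    {
        lock (Gate)
        {
            // Open lazily on first use, then hand out the same instance.
            // Note: indexPath is only honoured on the first call.
            if (instance == null)
            {
                instance = new FakeSearcher(indexPath);
            }
            return instance;
        }
    }
}
```

The lock is there because a web application serves requests on several
threads; without it two requests arriving together could each open
their own searcher.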

Regards,

-- George Aroush

-----Original Message-----
From: Oliver Donald [mailto:oliver.donald@SnowValley.com]
Sent: Tuesday, October 31, 2006 12:11 PM
To: lucene-net-user@incubator.apache.org
Subject: RE: Lucene memory usage

I've begun sharing a single instance and the memory usage is much
better... it's still creeping up, but at a slightly less scary 0-5 megs
per search.

Anyways thanks everyone for your help!
Oli

-----Original Message-----
From: Murad James [mailto:murad@7northfield.com]
Sent: 31 October 2006 16:29
To: lucene-net-user@incubator.apache.org
Subject: RE: Lucene memory usage

GC.SuppressFinalize stops the .NET garbage collector (GC) from calling
the object's ~finalizer (a default method used to clean up the object),
because the Close() method wants the GC to know that it has already done
everything that needs to be done.
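
A minimal sketch of that pattern, assuming a hypothetical IndexResource
class. It illustrates the general .NET idiom only and is not
Lucene.Net's actual Close() code:

```csharp
using System;

// Hypothetical resource holder; an illustration, not Lucene.Net's actual code.
public class IndexResource
{
    private bool closed;

    // Number of times resources were actually released (should never exceed 1).
    public int ReleaseCount { get; private set; }

    public void Close()
    {
        Release();
        // Cleanup is already done, so the GC need not run the finalizer.
        GC.SuppressFinalize(this);
    }

    private void Release()
    {
        if (closed) return;   // repeated Close() calls are harmless
        closed = true;
        ReleaseCount++;
    }

    // Safety net: runs only if Close() was never called.
    ~IndexResource()
    {
        Release();
    }
}
```

Suppressing the finalizer does not free memory by itself; it only spares
the GC the extra finalization pass for an object that has already
cleaned up after itself.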
 
It does seem like you have a common or garden problem with a reference
hanging around - this would show the behaviour you describe. By that, I
mean that there is something holding a reference to the search strings,
perhaps indirectly, that is preventing them from being released.
However, my knowledge of Lucene.Net is not deep enough to be certain
that it is not something within Lucene.Net itself.

________________________________

From: Oliver Donald [mailto:oliver.donald@SnowValley.com]
Sent: Tue 31/10/2006 16:17
To: lucene-net-user@incubator.apache.org
Subject: RE: Lucene memory usage



Hi,

I've put the call in the page unload() event, as it seems safest. I
realize it's more efficient to re-use the Searcher, but I want to
isolate the leak first.

Unfortunately memory usage is still going up at pretty much the same
rate; it's quite jumpy, so it's hard to tell, but I still get the out of
memory exception :(

Looking in the memory profiler, the references are the same as before I
added the Close() call on the IndexSearcher.

I noticed that in the Close method for the IndexSearcher, there are
calls to GC.SuppressFinalize(). What are they there for?

Thanks,
Oli



-----Original Message-----
From: Andy Berryman [mailto:andyb@channeladvisor.com]
Sent: 31 October 2006 15:25
To: lucene-net-user@incubator.apache.org
Subject: RE: Lucene memory usage

You'll most likely have to test different scenarios to determine the
best location for this call.  But you'll have to wait till after you
have gathered all the results that you want from the "Hits.Doc()" method
call.
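
A minimal sketch of that ordering, using hypothetical LazySearcher and
LazyHits stand-ins (not Lucene.Net classes). They assume, as the null
reference exception reported in this thread suggests, that documents are
read on demand and so need the searcher to still be open. The helper
copies everything it needs first and closes the searcher last:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins that mimic lazy document loading; not Lucene.Net classes.
public sealed class LazySearcher
{
    private readonly string[] docs;

    public LazySearcher(string[] docs) { this.docs = docs; }

    public bool Closed { get; private set; }

    public LazyHits Search() { return new LazyHits(this, docs.Length); }

    public string Load(int n)
    {
        if (Closed) throw new InvalidOperationException("searcher is closed");
        return docs[n];
    }

    public void Close() { Closed = true; }
}

public sealed class LazyHits
{
    private readonly LazySearcher searcher;
    private readonly int length;

    public LazyHits(LazySearcher searcher, int length)
    {
        this.searcher = searcher;
        this.length = length;
    }

    public int Length() { return length; }

    // Documents are fetched on demand, so the searcher must still be open.
    public string Doc(int n) { return searcher.Load(n); }
}

public static class SearchHelper
{
    // Copies every document out of the hits, then closes the searcher.
    public static List<string> SearchAndClose(LazySearcher searcher)
    {
        try
        {
            LazyHits hits = searcher.Search();
            List<string> results = new List<string>();
            for (int i = 0; i < hits.Length(); i++)
            {
                results.Add(hits.Doc(i));
            }
            return results;
        }
        finally
        {
            searcher.Close();   // runs even if reading a document throws
        }
    }
}
```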

One thing that you'll also want to take into account is that you get
better search performance for consecutive searches by keeping the
Searcher object alive and reusing it.  But in some cases this isn't
realistic.  One thing to take note of if you decide to do some sort of
caching of the Searcher object is that it is only aware of documents
that were in the index at the time the object was opened.
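
A minimal sketch of one way to live with that limitation: remember the
index version the cached searcher was opened against and reopen when it
changes. FakeIndex, VersionedSearcher and SearcherCache are hypothetical
names, and how a real application detects that the index has changed is
deliberately left out:

```csharp
using System;

// Hypothetical index whose Version is bumped whenever documents are added.
public sealed class FakeIndex
{
    public long Version = 1;
}

// Hypothetical searcher that only sees the index as it was when opened.
public sealed class VersionedSearcher
{
    public VersionedSearcher(FakeIndex index) { OpenedAtVersion = index.Version; }

    public long OpenedAtVersion { get; private set; }
    public bool Closed { get; private set; }

    public void Close() { Closed = true; }
}

// Hypothetical cache: reuses the searcher until the index version moves on.
public sealed class SearcherCache
{
    private readonly object gate = new object();
    private readonly FakeIndex index;
    private VersionedSearcher current;

    public SearcherCache(FakeIndex index) { this.index = index; }

    public VersionedSearcher Get()
    {
        lock (gate)
        {
            if (current == null || current.OpenedAtVersion != index.Version)
            {
                VersionedSearcher old = current;
                current = new VersionedSearcher(index);
                // Simplification: a real application must not close a searcher
                // that other requests are still reading results from.
                if (old != null) old.Close();
            }
            return current;
        }
    }
}
```

Closing the old searcher immediately is the weak point of this sketch:
a request that is still reading results from it would hit the same kind
of failure described elsewhere in this thread.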

Andy

-----Original Message-----
From: Oliver Donald [mailto:oliver.donald@SnowValley.com]
Sent: Tuesday, October 31, 2006 10:12 AM
To: lucene-net-user@incubator.apache.org
Subject: RE: Lucene memory usage

Close() method? Yes, that might have something to do with it ;)

I never considered that because we've been using Lucene without an
issue for quite a while, so I thought our wrapper was OK and it was
something I had done or a Lucene quirk. I guess nothing has stressed it
quite as badly as my extensions... turns out our wrapper wasn't closing
the searcher :(

When is the best time to close the searcher, though? If I close it
after I receive the Hits collection, I get a null reference exception
from FSDirectory.cs when I try to call Hits.Doc(Int32 n).

This is for a web page. Is the best time to close the reader in the
Unload() event for the page?

Thanks,
Oli



-----Original Message-----
From: Murad James [mailto:murad@7northfield.com]
Sent: 31 October 2006 14:14
To: lucene-net-user@incubator.apache.org
Subject: RE: Lucene memory usage

Yes, this sounds like something still has a reference to those string
objects.

If that doesn't work, post us a code snippet!

________________________________

From: Andy Berryman [mailto:andyb@channeladvisor.com]
Sent: Tue 31/10/2006 14:12
To: lucene-net-user@incubator.apache.org
Subject: RE: Lucene memory usage



Are you executing the "Close()" method on all your objects?  I didn't
investigate the fix in depth, but my understanding was that the fix was
to make another call to clean up memory in the "Close()" method of the
Searcher/Reader object.

Andy

-----Original Message-----
From: Oliver Donald [mailto:oliver.donald@SnowValley.com]
Sent: Tuesday, October 31, 2006 9:04 AM
To: lucene-net-user@incubator.apache.org
Subject: RE: Lucene memory usage

Unfortunately that's the version I am using :( Thanks for the reply
though!

I've got 1.6 million strings, and then 250,000 Terms and 250,000
TermInfos. Tracing most of these back, they are referenced by Hashtables
that appear in the allocation call stack at
IndexSearcher.Search() -> FieldSortedQueue.cctor(). So it looks like my
old search results are hanging around in memory and stopping the
associated resources from being garbage collected...

Does this sound likely? Is there an obvious remedy or am I doing
something wrong? I'm using a single IndexSearcher. Does this need to be
explicitly told to dispose of previous search results?

Thanks,
Oli


-----Original Message-----
From: Andy Berryman [mailto:andyb@channeladvisor.com]
Sent: 31 October 2006 12:40
To: lucene-net-user@incubator.apache.org
Subject: RE: Lucene memory usage

This sounds very similar to a problem that I was having with version
1.4.3.  A fix for this was made by George in version 1.9.1 build 4.

Andy

________________________________

From: Oliver Donald [mailto:oliver.donald@SnowValley.com]
Sent: Tue 10/31/2006 5:51 AM
To: lucene-net-user@incubator.apache.org
Subject: Lucene memory usage



Hi,

I'm relatively new to Lucene, but have been given the task of
implementing a complicated search. What I've ended up with is a page
with search results built from a number of queries (12 in total!) and as
far as I can tell there is no way to reduce this to fewer queries.

My problem is that Lucene ends up using loads of RAM, but none of this
seems to get garbage collected. Each search adds another 10-30 megs of
data, with occasional spikes of 50+ megs. This memory does not get
garbage collected, and a few searches down the line I get an out of
memory exception thrown.

So I ran a memory profiler on the app, and it turns out that 90%
(800,000+) of the memory is System.String objects, and many of these
belong to Lucene, often referenced by Term -> Term[] -> TermInfosReader
-> SegmentReader. What are these strings? Some kind of cache?

Is there anything I have to be careful of when performing groups of
searches at once? Is there any manual memory management I should be
doing? Is there anything I can do to reduce the amount of memory Lucene
uses?

My queries are generally simple, no more than 8 OR clauses each,
although they all have a range filter and sometimes a beginsWith filter
applied to them. I've been careful to avoid wildcards and ranges in the
main query itself!

Any help greatly appreciated!
Cheers,
Oli