lucene-java-user mailing list archives

From "Uwe Schindler" <>
Subject RE: WeakIdentityMap high memory usage
Date Wed, 07 Aug 2013 17:31:04 GMT
Hi Denis,

I assume you were using Lucene 3.6.0 before the upgrade, because Lucene 3.6.1 also tracks buffers
using weak references (although there you cannot switch it off, unfortunately).

I can confirm what Mike says: It's all weak references, and the overhead may be large, but
it gets freed when memory gets low. In general it is better not to allocate too much heap
space for Lucene, as this makes those maps larger and stresses the GC. Give the JVM only as
much heap as is needed to avoid OOM, and leave the remaining memory to the file system cache
(so it has less paging). In that case, GC will clean up the concurrent maps faster.
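
To illustrate why such a map shows up as millions of small reference objects in a heap dump,
here is a minimal JDK-only sketch of a weak, identity-keyed map. It is not Lucene's
WeakIdentityMap; the names WeakIdentitySketch, IdentityRef and WeakIdentityDemo are
hypothetical and exist only for this example. Every tracked key costs one weak reference
object plus one map entry, and stale entries only go away after the GC has cleared the
reference and the map has polled its queue.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a weak, identity-keyed map (not Lucene's class). */
final class WeakIdentitySketch<K, V> {
    // The GC enqueues a key reference here after its referent has been collected.
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    private final ConcurrentHashMap<IdentityRef, V> backing = new ConcurrentHashMap<>();

    /** Weak reference that compares by referent identity, not by equals(). */
    private static final class IdentityRef extends WeakReference<Object> {
        private final int hash;

        IdentityRef(Object key, ReferenceQueue<Object> queue) {
            super(key, queue);
            // Cache the hash so the entry can still be found after the key is cleared.
            this.hash = System.identityHashCode(key);
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof IdentityRef)) return false;
            Object mine = get();
            return mine != null && mine == ((IdentityRef) o).get();
        }
    }

    void put(K key, V value) {
        reap();
        backing.put(new IdentityRef(key, queue), value);
    }

    V get(K key) {
        reap();
        // Lookup reference is not registered with the queue (null queue is allowed).
        return backing.get(new IdentityRef(key, null));
    }

    int size() {
        reap();
        return backing.size();
    }

    /** Removes entries whose keys the GC has already cleared. */
    private void reap() {
        Reference<?> stale;
        while ((stale = queue.poll()) != null) {
            backing.remove(stale);
        }
    }
}

final class WeakIdentityDemo {
    public static void main(String[] args) {
        WeakIdentitySketch<String, String> map = new WeakIdentitySketch<>();
        String a = new String("clone");
        String b = new String("clone"); // equal to a, but a different object
        map.put(a, "first");
        map.put(b, "second");
        // Both keys are tracked separately because lookup is by identity.
        System.out.println(map.size() + " entries, a -> " + map.get(a));
    }
}
```

When the GC actually clears the references is up to the JVM, which is why the map can look
large in a profiler as long as the heap is not under pressure.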

In general: If you have a large index that seldom changes, but your query rate is very high
(like 200 queries per second), switch unmapping off (works since Lucene 4.2, see the changelog
for LUCENE-4740 - unfortunately the issue itself was closed for 4.4, 4.2 would be correct).
In that case there is no need to take care of unmapping, and as the index reopen rate is low,
this does not waste resources.

But if your index changes often, there is no way around unmapping - or use NIOFSDirectory with
NRTCachingDirectory, which is optimized for near-real-time search on frequently changing indexes!

Finally: The only way to fix this would be to make all codec structures like TermsEnum or
DocsEnum, but also Scorer/DocIdSet/..., implement Closeable. When you are done with a Scorer
you would have to close it, and the underlying cloned IndexInput would be closed, too. In that
case, the cloned IndexInput would be refcounted and unmapped when the last clone is closed. This
is a larger change and might be an idea for Lucene 5.0 as an "optimization". It would be a
backwards break because all codecs and all queries would need to close correctly, but with our
test framework and MockDirWrapper (and other MockFooBarWrappers) we could track this so all
resources are closed.
We had TermEnum.close() up to Lucene 3.x, but it was dropped in 4.0 because it was never working
in 3.x (nobody ever called close() on TermEnum or TermDocs instances.... :( ). With our new
test framework this could be tracked now... So maybe worth a try?
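
The refcounting idea above could be sketched roughly as follows. This is a JDK-only
illustration, not Lucene code: SharedMapping, RefCountedInput, cloneInput() and RefCountDemo
are hypothetical names, and the "unmap" is only a flag standing in for releasing a real
mapped buffer. The point is that every clone holds one reference, and the shared mapping is
released only when the last handle is closed.

```java
import java.io.Closeable;

/** Hypothetical stand-in for a memory-mapped region shared by all clones. */
final class SharedMapping {
    private int refCount = 1;        // the handle that opened the mapping
    private boolean unmapped = false;

    synchronized void incRef() {
        if (unmapped) throw new IllegalStateException("mapping already released");
        refCount++;
    }

    synchronized void decRef() {
        if (--refCount == 0) {
            unmapped = true;         // a real implementation would unmap the buffer here
        }
    }

    synchronized boolean isUnmapped() {
        return unmapped;
    }
}

/** Hypothetical closeable input handle; clones share one SharedMapping. */
final class RefCountedInput implements Closeable {
    private final SharedMapping mapping;
    private boolean closed = false;

    private RefCountedInput(SharedMapping mapping) {
        this.mapping = mapping;
    }

    static RefCountedInput open() {
        return new RefCountedInput(new SharedMapping());
    }

    RefCountedInput cloneInput() {
        if (closed) throw new IllegalStateException("input already closed");
        mapping.incRef();
        return new RefCountedInput(mapping);
    }

    boolean isUnmapped() {
        return mapping.isUnmapped();
    }

    @Override
    public void close() {
        if (closed) return;          // closing twice must not release twice
        closed = true;
        mapping.decRef();
    }
}

final class RefCountDemo {
    public static void main(String[] args) {
        RefCountedInput original = RefCountedInput.open();
        try (RefCountedInput clone = original.cloneInput()) {
            original.close();
            System.out.println(clone.isUnmapped());   // false: the clone still holds a reference
        }
        System.out.println(original.isUnmapped());    // true: the last handle was closed
    }
}
```

With a scheme like this no weak map is needed to track clones, but every consumer has to
close what it opens, which is exactly the backwards break described above.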


Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen

> -----Original Message-----
> From: Michael McCandless []
> Sent: Wednesday, August 07, 2013 3:45 PM
> To: Lucene Users
> Subject: Re: WeakIdentityMap high memory usage
> This map is used to track all cloned open files, which can be a very large
> number over time (each search will create maybe 3 of them).
> This is done as a "best effort" to prevent SEGV (JVM dies) if you accidentally
> try to use an IndexReader after it was closed, while using MMapDirectory.
> However, it's a weak map, which means when HEAP is tight GC should drop
> it.
> So, this should not cause a real problem in "real life", even though it looks
> scary when you look at its RAM usage under a profiler.
> If somehow it's causing "real life" problems, please report back!  But a simple
> workaround is to call MMapDirectory.setUseUnmap(false) to turn off this
> tracking; this means you rely on GC to (eventually) unmap.
> Mike McCandless
> On Wed, Aug 7, 2013 at 2:45 AM, Denis Bazhenov <>
> wrote:
> > We have upgraded from Lucene 3.6 to 4.4. In production we faced high
> > minor GC time. A heap dump showed that one of the biggest objects by size is
> > org.apache.lucene.util.WeakIdentityMap$IdentityWeakReference. About 11
> > million instances with about 377 megabytes of memory in total (this is not
> > even retained size). Here is a screenshot of the JProfiler output:
> > 08-07%20at%205.35.22%20PM.png.
> >
> > The keys of the map are MMapIndexInput. What is this map for, and how
> > can I reduce its memory usage?
> > ---
> > Denis Bazhenov <>
> > FarPost.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail:
> > For additional commands, e-mail:
> >