lucene-java-user mailing list archives

From Tom Hill <solr-l...@worldware.com>
Subject Re: heap memory issues when sorting by a string field
Date Tue, 08 Dec 2009 00:57:25 GMT
Hey, that's a nice little class! I hadn't seen it before. It sounds like
the asynchronous cleanup might deal with the problem I mentioned earlier
(but I haven't looked at the code yet).

It's under the Apache license, but you mentioned something about no
third-party libraries. Is that a policy for Lucene?

Thanks,

Tom


On Mon, Dec 7, 2009 at 4:44 PM, Jason Rutherglen <jason.rutherglen@gmail.com> wrote:

> I wonder if the concurrent map in Google Collections, which supports
> weak keys, handles the removal of weakly referenced keys more elegantly
> than Java's WeakHashMap (even though we don't use third-party
> libraries)?
>
> On Mon, Dec 7, 2009 at 4:38 PM, Tom Hill <solr-list@worldware.com> wrote:
> > Hi -
> >
> > If I understand correctly, WeakHashMap does not free the memory for the
> > value (cached data) when the key is nulled, or even when the key is
> > garbage collected.
> >
> > It requires one more step: a method on WeakHashMap must be called to
> > allow it to release its hard reference to the cached data. It appears
> > that most methods in WeakHashMap end up calling expungeStaleEntries,
> > which will clear the hard reference. But you have to call some method on
> > the map before the memory is eligible for garbage collection.
> >
> > So it takes four stages to free the cached data: null the key; a GC to
> > clear the weak reference to the key; a call to some method on the map;
> > then the next GC cycle should free the value.
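
The four stages can be modelled with a small weak-keyed cache. This is only a simplified sketch, not WeakHashMap's actual source: the class WeakKeyCacheSketch, its KeyRef type, and the heldValues() method are hypothetical names, and Reference.enqueue() is called by hand to stand in for the garbage collector so that the behaviour is repeatable.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

/** Simplified model of a weak-keyed cache (hypothetical; NOT the real WeakHashMap code). */
final class WeakKeyCacheSketch<K, V> {

    /** Weak reference to a key, registered with a queue so dead keys can be found later. */
    static final class KeyRef<T> extends WeakReference<T> {
        KeyRef(T key, ReferenceQueue<T> queue) {
            super(key, queue);
        }
    }

    private final ReferenceQueue<K> queue = new ReferenceQueue<>();

    // Keys are held weakly, but every value is held through an ordinary strong reference.
    private final Map<KeyRef<K>, V> entries = new HashMap<>();

    /** Stores a value and returns the weak key reference (returned only so the demo can enqueue it). */
    KeyRef<K> put(K key, V value) {
        expungeStaleEntries();
        KeyRef<K> ref = new KeyRef<>(key, queue);
        entries.put(ref, value);
        return ref;
    }

    /** Number of values still strongly held; deliberately does NOT expunge first. */
    int heldValues() {
        return entries.size();
    }

    /** Stage 3: drop the value for every key reference found on the queue. */
    int expungeStaleEntries() {
        int before = entries.size();
        Reference<? extends K> ref;
        while ((ref = queue.poll()) != null) {
            entries.remove(ref);
        }
        return before - entries.size();
    }

    public static void main(String[] args) {
        WeakKeyCacheSketch<Object, byte[]> cache = new WeakKeyCacheSketch<>();
        Object reader = new Object();                    // stands in for the old reader
        KeyRef<Object> ref = cache.put(reader, new byte[1024]);

        reader = null;                                   // stage 1: null the key
        ref.enqueue();                                   // stage 2: simulate the GC enqueueing the key reference
        System.out.println(cache.heldValues());          // prints 1: the value is still strongly held

        cache.expungeStaleEntries();                     // stage 3: some method on the map runs the cleanup
        System.out.println(cache.heldValues());          // prints 0: the value can go in the next GC (stage 4)
    }
}
```

In this model the value stays reachable after the key reference has been enqueued, and only becomes collectable once the cleanup method has run, which is the window described above.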
> >
> > So it seems possible that you could end up with double memory usage for a
> > time. If you don't have a GC between the time that you close the old
> > reader, and you start to load the field cache entry for the next reader,
> > then the key may still be hanging around uncollected.
> >
> > At that point, it may run a GC when you allocate the new cache, but
> > that's only the first GC. It can't free the cached data until after the
> > next call to expungeStaleEntries, so for a while you have both caches
> > around.
> >
> > This extra usage could cause things to move into tenured space. Could
> > this be causing your problem?
> >
> > A workaround would be to cause some method to be called on the
> > WeakHashMap. You don't want to call get(), since that will try to
> > populate the cache. Maybe try putting a small value into the cache and
> > doing a GC, and see if your memory drops then.
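
A rough sketch of that workaround on a plain java.util.WeakHashMap follows. It assumes you already hold a reference to the map; how to reach the map inside Lucene's field cache is not shown, because that map is internal. The class ExpungeNudge and its nudge() helper are hypothetical names. In the OpenJDK implementation, size() is one of the methods that runs the stale-entry cleanup when the map is non-empty. System.gc() is only a hint, so the printed count is not guaranteed.

```java
import java.util.Map;
import java.util.WeakHashMap;

/** Hypothetical helper: touch a weak-keyed map so it can drop stale entries. */
final class ExpungeNudge {

    /** Calls a cheap method on the map; returns the number of entries it still reports. */
    static int nudge(Map<?, ?> weakMap) {
        return weakMap.size();
    }

    public static void main(String[] args) throws InterruptedException {
        Map<Object, byte[]> cache = new WeakHashMap<>();
        Object oldReader = new Object();            // stands in for the closed reader
        cache.put(oldReader, new byte[8 * 1024 * 1024]);

        oldReader = null;                           // drop the only strong reference to the key
        System.gc();                                // request a GC (a hint, not a guarantee)
        Thread.sleep(100);                          // give the collector a moment to enqueue the key

        int live = nudge(cache);                    // lets the map release its value references
        System.gc();                                // a later GC can now reclaim the cached data
        System.out.println("entries after nudge: " + live);
    }
}
```

Comparing heap usage before and after the second GC would show whether the stale value was what kept the memory alive.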
> >
> >
> > Tom
> >
> >
> >
> > On Mon, Dec 7, 2009 at 1:48 PM, TCK <moonwatcher32329@gmail.com> wrote:
> >
> >> Thanks for the response. But I'm definitely calling close() on the old
> >> reader and opening a new one (not using reopen). Also, to simplify the
> >> analysis, I did my test with a single-threaded requester to eliminate
> >> any concurrency issues.
> >>
> >> I'm doing:
> >> sSearcher.getIndexReader().close();
> >> sSearcher.close(); // this actually seems to be a no-op
> >> IndexReader newIndexReader = IndexReader.open(newDirectory);
> >> sSearcher = new IndexSearcher(newIndexReader);
> >>
> >> Btw, isn't it bad practice anyway to have an unbounded cache? Are there
> >> any plans to replace the HashMaps used for the innerCaches with an
> >> actual size-bounded cache with some eviction policy (perhaps EhCache or
> >> something)?
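
For illustration, a size-bounded cache with least-recently-used eviction can be sketched with the JDK's LinkedHashMap alone. This is not what Lucene does, and BoundedLruCache is a hypothetical name. The bound counts entries rather than bytes, and the class is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical size-bounded cache that evicts the least recently used entry. */
final class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    BoundedLruCache(int maxEntries) {
        // accessOrder = true makes iteration order run from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedLruCache<String, Integer> cache = new BoundedLruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");                      // "a" becomes the most recently used entry
        cache.put("c", 3);                   // exceeds the bound, so "b" is evicted
        System.out.println(cache.keySet());  // prints [a, c]
    }
}
```

Because a single field cache entry can be very large, a bound on the entry count alone may not bound memory, so a real replacement would likely need to weigh entries by size.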
> >>
> >> Thanks again,
> >> TCK
> >>
> >>
> >>
> >>
> >> On Mon, Dec 7, 2009 at 4:37 PM, Erick Erickson <erickerickson@gmail.com> wrote:
> >>
> >> > What this sounds like is that you're not really closing your
> >> > readers even though you think you are. Sorting indeed uses up
> >> > significant memory when it populates internal caches and keeps
> >> > it around for later use (which is one of the reasons that warming
> >> > queries matter). But if you really do close the reader, I'm pretty
> >> > sure the memory should be GC-able.
> >> >
> >> > One thing that trips people up is IndexReader.reopen(). If it
> >> > returns a reader different from the original, you *must* close the
> >> > old one. If you don't, the old reader is still hanging around and
> >> > memory won't be returned. An example from the Javadocs:
> >> >
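> >> >  IndexReader reader = ...
> >> >  ...
> >> >  // "new" is a reserved word in Java, so the reopened reader needs another name
> >> >  IndexReader newReader = reader.reopen();
> >> >  if (newReader != reader) {
> >> >   ...     // reader was reopened
> >> >   reader.close();   // close the old reader so it can be released
> >> >  }
> >> >  reader = newReader;
> >> >  ...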
> >> >
> >> >
> >> > If this is irrelevant, could you post your close/open code?
> >> >
> >> > HTH
> >> >
> >> > Erick
> >> >
> >> >
> >> > On Mon, Dec 7, 2009 at 4:27 PM, TCK <moonwatcher32329@gmail.com> wrote:
> >> >
> >> > > Hi,
> >> > > I'm having heap memory issues when I do Lucene queries involving
> >> > > sorting by a string field. Such queries seem to load a lot of data
> >> > > into the heap. Moreover, Lucene seems to hold on to references to
> >> > > this data even after the index reader has been closed and a full GC
> >> > > has been run. One consequence is that, in my generational heap
> >> > > configuration, a lot of memory gets promoted to tenured space each
> >> > > time I close the old index reader and then open and query using a
> >> > > new one. The tenured space eventually gets fragmented, causing a lot
> >> > > of promotion failures, which result in JVM hangs while the JVM does
> >> > > stop-the-world GCs.
> >> > >
> >> > > Does anyone know any workarounds to avoid these memory issues when
> >> > > doing such Lucene queries?
> >> > >
> >> > > My profiling showed that even after a full GC, Lucene is holding on
> >> > > to a lot of references to field value data, notably via the
> >> > > FieldCacheImpl/ExtendedFieldCacheImpl. I noticed that the WeakHashMap
> >> > > readerCaches use unbounded HashMaps as the innerCaches, and I used
> >> > > reflection to replace these innerCaches with dummy empty HashMaps,
> >> > > but I'm still seeing the same behavior. I wonder if anyone has gone
> >> > > through these same issues before and could offer any advice.
> >> > >
> >> > > Thanks a lot,
> >> > > TCK
> >> > >
> >> >
> >>
> >
>