lucene-java-user mailing list archives

From Ruben Laguna <ruben.lag...@gmail.com>
Subject IndexWriter memory leak?
Date Wed, 07 Apr 2010 20:35:50 GMT
Hi,

It seems that my IndexWriter, after committing and optimizing, has a retained
size of 140MB. See [1] for a screenshot of the heap dump analysis done with
Eclipse MAT.

Of those 140MB, 67MB are retained by
analyzer.tokenStreams.hardRefs.table.HashMap$Entry.value.tokenStream.scanner.zzBuffer


Why is this? Is it a memory leak, or did I do something wrong during
indexing? (BTW, I'm indexing documents which contain Fields(xxxx,Reader), and
those Readers are wrappers around Tika.parse(xxxx) Readers. I get a lot of
IOExceptions from the Tika readers, and the wrapper maps the exceptions to EOF
so Lucene doesn't see them.)
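
To make the wrapper described above concrete, here is a minimal sketch of that
kind of Reader. It is an illustration only, not the actual code: the class name
EofOnErrorReader and the hasFailed() accessor are hypothetical, and it assumes
that any IOException from the underlying Reader should simply be reported as
end of stream.

```java
import java.io.IOException;
import java.io.Reader;

/**
 * Hypothetical wrapper (names are illustrative): reports end of stream
 * instead of propagating an IOException from the wrapped Reader.
 */
final class EofOnErrorReader extends Reader {
    private final Reader delegate;
    private boolean failed; // set once the wrapped Reader has thrown

    EofOnErrorReader(Reader delegate) {
        this.delegate = delegate;
    }

    @Override
    public int read(char[] cbuf, int off, int len) {
        if (failed) {
            return -1; // stay at EOF after the first failure
        }
        try {
            return delegate.read(cbuf, off, len);
        } catch (IOException e) {
            failed = true; // remember the failure and pretend the stream ended
            return -1;
        }
    }

    @Override
    public void close() throws IOException {
        delegate.close();
    }

    /** True if an IOException was swallowed and mapped to EOF. */
    boolean hasFailed() {
        return failed;
    }
}
```

With a wrapper like this, the consumer only ever sees a shorter stream, so a
truncated document is indexed without any error being reported.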



...and 73MB of the 140MB are retained by docWriter; see [2]. It looks like
the Field objects in the
array docWriter.threadStates[0].consumer.fieldHash[1].fields[xxxx] are
holding references to the Readers. Those Reader instances are actually
closed after IndexWriter.updateDocument. Each one of those Readers retains
1MB. The question is why IndexWriter holds references to those Readers after
the Documents have been indexed.


[1] http://img.skitch.com/20100407-1183815yiausisg73u9wfgscsj.jpg
[2] http://img.skitch.com/20100407-b86irkp7e4uif2wq1dd4t899qb.jpg

-- 
/Rubén
