lucene-dev mailing list archives

From "Tim Smith (JIRA)" <>
Subject [jira] [Commented] (LUCENE-4398) "Memory Leak" in TermsHashPerField memory tracking
Date Mon, 17 Sep 2012 19:23:08 GMT


Tim Smith commented on LUCENE-4398:

NOTE: the 16 bytes of "unaccounted" space in the postingsHash is actually much less than the object
header fields require for TermsHashPerField.

So I would argue that not accounting for these 16 bytes is a valid low-profile fix for this issue.

The only gotcha would be if trimFields() is ever called on a TermsHashPerField that has not
been shrunk back down to size by a flush.
Is this possible?
Even if it is possible, I expect this only occurs in the rare case of deep-down exceptions?
In that case, if abort() is called, I suppose the abort() method could be updated to shrink
down the hash as well (if this is safe to do).
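
To make the accounting argument concrete, here is a minimal sketch of a toy model. It is not Lucene code. It assumes (this is an assumption, not something the message states) that the 16 bytes correspond to an initial 4-slot int hash that is charged to the tracker at construction and never credited back when the per-field object is discarded. The class TrackedHashModel, its constants, and leakedBytes() are all hypothetical names used only for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical toy model of per-field memory tracking; NOT the Lucene implementation.
final class TrackedHashModel {
    static final int INITIAL_SLOTS = 4;  // assumed initial hash size
    static final int BYTES_PER_SLOT = 4; // one Java int per slot

    private final AtomicLong tracker;
    private int[] hash = new int[INITIAL_SLOTS];

    TrackedHashModel(AtomicLong tracker, boolean chargeInitial) {
        this.tracker = tracker;
        if (chargeInitial) {
            // Assumed source of the 16 "unaccounted" bytes: 4 slots * 4 bytes.
            tracker.addAndGet((long) INITIAL_SLOTS * BYTES_PER_SLOT);
        }
    }

    // Doubles the hash and charges only the additional bytes.
    void grow() {
        int[] bigger = new int[hash.length * 2];
        tracker.addAndGet((long) (bigger.length - hash.length) * BYTES_PER_SLOT);
        hash = bigger;
    }

    // Shrinks back to the initial size and credits only the bytes above it.
    void shrink() {
        long released = (long) (hash.length - INITIAL_SLOTS) * BYTES_PER_SLOT;
        tracker.addAndGet(-released);
        hash = new int[INITIAL_SLOTS];
    }

    // Creates, grows, optionally shrinks, and then discards 'fields' objects.
    // Returns the bytes the tracker still reports after all objects are gone.
    static long leakedBytes(int fields, boolean chargeInitial, boolean shrinkFirst) {
        AtomicLong tracker = new AtomicLong();
        for (int i = 0; i < fields; i++) {
            TrackedHashModel f = new TrackedHashModel(tracker, chargeInitial);
            f.grow();
            f.grow();
            if (shrinkFirst) {
                f.shrink();
            }
            // The object is dropped here with no further credit (the "trim").
        }
        return tracker.get();
    }
}
```

In this model, leakedBytes(3, true, true) leaves 48 bytes tracked (16 per discarded field), while leakedBytes(3, false, true) leaves 0, which is the low-profile fix of not charging the initial 16 bytes. The gotcha shows up as leakedBytes(1, false, false): a field discarded without being shrunk first still leaves its grown bytes (48 here) on the tracker.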

> "Memory Leak" in TermsHashPerField memory tracking
> --------------------------------------------------
>                 Key: LUCENE-4398
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 3.4
>            Reporter: Tim Smith
> I am witnessing an apparent leak in the memory tracking used to determine when a flush
> is necessary.
> Over time, this will result in every single document being flushed into its own segment,
> as the memUsage will remain above the configured buffer size, causing a flush to be triggered
> after every add/update.
> Best I can figure, this is being caused by TermsHashPerField's tracking of memory usage
> for postingsHash and/or postingsArray combined with multi-threaded feeding.
> I suspect that the TermsHashPerField's postingsHash is growing in one thread, then, when
> a segment is flushed, a single, different thread will merge all TermsHashPerFields in FreqProxTermsWriter
> and then call shrinkHash(). I suspect this call of shrinkHash() is seeing an old postingsHash
> array, and subsequently not releasing all the memory that was allocated.
> If this is the case, I am also concerned that FreqProxTermsWriter will not write the
> correct terms into the index, although I have not confirmed that any indexing problem occurs
> as of yet.
> NOTE: I am witnessing this growth in a test by subtracting the amount of memory allocated
> (but in a "free" state) by perDocAllocator/byteBlockAllocator/charBlocks/intBlocks from DocumentsWriter.memUsage.get()
> in IndexWriter.doAfterFlush().
> I will see this stay at a stable point for a while, then on some flushes, I will see
> this grow by a couple of bytes, and all subsequent flushes will never go back down to the
> previous state.
> I will continue to investigate and post any additional findings.
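
The consequence described in the report (every document ending up in its own segment) can be illustrated with a rough sketch. It assumes a simplified trigger in which a flush fires once tracked usage reaches the buffer size, and in which a flush releases the buffered document bytes but not the leaked ones. FlushDriftModel and countFlushes() are hypothetical names, and the real flush logic in Lucene is more involved.

```java
// Hypothetical model of a memory-based flush trigger; NOT the Lucene implementation.
final class FlushDriftModel {
    // Counts how many flushes are triggered while adding 'docs' documents.
    static int countFlushes(int docs, long bufferBytes, long bytesPerDoc, long leakedBytes) {
        long memUsage = leakedBytes; // leaked bytes are never released
        int flushes = 0;
        for (int i = 0; i < docs; i++) {
            memUsage += bytesPerDoc;
            if (memUsage >= bufferBytes) { // assumed trigger condition
                flushes++;
                memUsage = leakedBytes;    // flush frees document bytes only
            }
        }
        return flushes;
    }
}
```

With a 1000-byte buffer and 100-byte documents, 100 adds trigger 10 flushes when nothing has leaked, 20 flushes with 500 leaked bytes, and 100 flushes (one per document) once the leaked bytes reach the buffer size.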
