lucene-dev mailing list archives

From "Tim Smith (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-2283) Possible Memory Leak in StoredFieldsWriter
Date Wed, 24 Feb 2010 15:00:28 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837821#action_12837821 ]

Tim Smith commented on LUCENE-2283:
-----------------------------------

ramBufferSizeMB is 64MB

Here's the YourKit breakdown of retained memory per class:
* DocumentsWriter - 256 MB
** TermsHash - 38.7 MB
** StoredFieldsWriter - 37.5 MB
** DocumentsWriterThreadState - 36.2 MB
** DocumentsWriterThreadState - 34.6 MB
** DocumentsWriterThreadState - 33.8 MB
** DocumentsWriterThreadState - 27.5 MB
** DocumentsWriterThreadState - 13.4 MB

I'm starting to dig into the ThreadStates now to see if anything stands out here

bq. Hmm, that makes me nervous, because I think in this case the use should be bounded.

I should be getting a new profile dump at "crash" time soon, so hopefully that will make things
clearer

bq. That doesn't sound good! Can you post some details on this (eg an exception)?

If I recall correctly, the exception was caused by an out-of-disk-space situation.
Obviously not much can be done about that other than adding more disk space. The
situation would recover on its own, but docs would be lost in the interim.

bq. But, anyway, keeping the same IW open and just calling commit is (should be) fine.

Yeah, this should be the way to go, especially since the pooled buffers then don't need
to be reallocated or reclaimed. However, right now it is also the only change I can
think of that could result in memory issues.

bq. Yes, that's a great solution - a single pool. But that's a somewhat bigger change. 

Seems like this would be the best approach as it makes the memory bounded by the configuration
of the engine, giving better reuse of byte blocks and better ability to reclaim memory (in
DocumentsWriter.balanceRAM())
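
To make the single-pool idea concrete, here is a minimal sketch, assuming one allocator shared by all writers and capped by a configured byte budget. BoundedBlockPool and its methods are hypothetical names for illustration; this is not Lucene's allocator and it does not reproduce what DocumentsWriter.balanceRAM() actually does.

```java
import java.util.ArrayDeque;

/** Hypothetical sketch of one shared, bounded pool of fixed-size byte blocks. */
public class BoundedBlockPool {
    private final int blockSize;
    private final int maxFreeBlocks;
    private final ArrayDeque<byte[]> free = new ArrayDeque<>();

    /** maxPooledBytes is the assumed configured budget for idle (pooled) blocks. */
    public BoundedBlockPool(int blockSize, long maxPooledBytes) {
        this.blockSize = blockSize;
        this.maxFreeBlocks = (int) Math.min(Integer.MAX_VALUE, maxPooledBytes / blockSize);
    }

    /** Reuse a pooled block if one is available, else allocate. Reused blocks are not zeroed. */
    public synchronized byte[] acquire() {
        byte[] block = free.pollFirst();
        return block != null ? block : new byte[blockSize];
    }

    /** Keep the block only while the pool is under budget; otherwise leave it to the GC. */
    public synchronized void release(byte[] block) {
        if (block.length == blockSize && free.size() < maxFreeBlocks) {
            free.addFirst(block);
        }
    }

    /** Bytes currently held by idle blocks. */
    public synchronized long pooledBytes() {
        return (long) free.size() * blockSize;
    }

    /** Drop idle blocks until at most targetBytes remain pooled; returns the number dropped. */
    public synchronized int trimTo(long targetBytes) {
        int dropped = 0;
        while (pooledBytes() > targetBytes && !free.isEmpty()) {
            free.pollLast();
            dropped++;
        }
        return dropped;
    }
}
```

With a single pool like this, every consumer releases into the same free list, so idle memory is bounded by one setting and a rebalancing step only needs to call something like trimTo() in one place.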




> Possible Memory Leak in StoredFieldsWriter
> ------------------------------------------
>
>                 Key: LUCENE-2283
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2283
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 2.4.1
>            Reporter: Tim Smith
>            Assignee: Michael McCandless
>             Fix For: 3.1
>
>
> StoredFieldsWriter creates a pool of PerDoc instances
> this pool will grow but never be reclaimed by any mechanism
> furthermore, each PerDoc instance contains a RAMFile.
> this RAMFile will also never be truncated (and will only ever grow) (as far as I can
> tell)
> When feeding documents with large number of stored fields (or one large dominating stored
> field) this can result in memory being consumed in the RAMFile but never reclaimed. Eventually,
> each pooled PerDoc could grow very large, even if large documents are rare.
> Seems like there should be some attempt to reclaim memory from the PerDoc[] instance
> pool (or otherwise limit the size of RAMFiles that are cached) etc
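
One possible shape for the reclamation suggested in the issue description, as a minimal sketch under stated assumptions: buffers are recycled through a pool, but any buffer that has grown past a cap is dropped on release instead of being cached. CappedBufferPool and Buffer are hypothetical stand-ins for the PerDoc pool and its RAMFile; none of this is Lucene code or the fix that was eventually applied.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

/** Hypothetical sketch: recycle per-document buffers, but never cache oversized ones. */
public class CappedBufferPool {

    /** Stand-in for a per-document buffer that grows on demand and never shrinks. */
    public static final class Buffer {
        private byte[] data = new byte[16];
        private int length;

        public void write(byte[] src) {
            if (length + src.length > data.length) {
                data = Arrays.copyOf(data, Math.max(data.length * 2, length + src.length));
            }
            System.arraycopy(src, 0, data, length, src.length);
            length += src.length;
        }

        public int length() { return length; }

        public int capacity() { return data.length; }
    }

    private final int maxRetainedCapacity;
    private final int maxPooled;
    private final ArrayDeque<Buffer> free = new ArrayDeque<>();

    public CappedBufferPool(int maxRetainedCapacity, int maxPooled) {
        this.maxRetainedCapacity = maxRetainedCapacity;
        this.maxPooled = maxPooled;
    }

    public synchronized Buffer acquire() {
        Buffer b = free.pollFirst();
        return b != null ? b : new Buffer();
    }

    /** Returns true if the buffer was kept for reuse, false if it was dropped. */
    public synchronized boolean release(Buffer b) {
        b.length = 0; // logical reset; the backing array keeps its grown capacity
        if (b.capacity() > maxRetainedCapacity || free.size() >= maxPooled) {
            return false; // oversized or pool full: let the GC reclaim it
        }
        free.addFirst(b);
        return true;
    }

    public synchronized int size() { return free.size(); }
}
```

Under this scheme a rare large document costs one large allocation that is freed after use, rather than permanently inflating a pooled instance.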

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

