lucene-dev mailing list archives

From "Earwin Burrfoot (JIRA)" <>
Subject [jira] Commented: (LUCENE-2328) IndexWriter.synced field accumulates data leading to a Memory Leak
Date Thu, 18 Mar 2010 15:24:27 GMT


Earwin Burrfoot commented on LUCENE-2328:

> Keeping track of not-yet-sync'd files instead of sync'd files is better, but it still
> requires upkeep (ie when file is deleted you have to remove it) because files can be opened,
> written to, closed, deleted without ever being sync'd.
You can just skip this upkeep and handle the FileNotFound exception when syncing. You have
to handle it anyway, since there is no guarantee that a file won't be snatched from under your nose.

> This will over-sync in some situations.
I don't feel this is a serious problem. If you over-sync (in fact, sync some files a little
earlier than strictly required), a few seconds later you will under-sync, so the total time is
still the same.

But I feel you're somewhat missing the point. A system-wide sync is not the original aim; it's
just a possible byproduct of the original aim, which is to move the sync-tracking code from IW
to Directory. And I don't see at all how adding batch syncs achieves this.
If you're calling sync(Collection<String>), damn, you have to keep that collection somewhere
:) and it is supposed to live inside the Directory!
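
To make the proposal concrete, here is a minimal sketch of a directory that keeps the
not-yet-synced set itself and tolerates files that vanish before the sync. It is an
illustration only, not Lucene's Directory API: the class name SyncTrackingDirectory and the
methods markWritten, pendingCount and the no-argument sync are hypothetical. It also uses
java.nio, where a missing file surfaces as NoSuchFileException rather than
FileNotFoundException.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: a directory wrapper that owns its own sync bookkeeping. */
class SyncTrackingDirectory {
    private final Path root;
    // Names written since the last successful sync. The set is drained by sync(),
    // so it cannot grow without bound the way an "already synced" set can.
    private final Set<String> pending = new HashSet<>();

    SyncTrackingDirectory(Path root) {
        this.root = root;
    }

    /** Record that a file was written and closed, and therefore needs a sync. */
    synchronized void markWritten(String name) {
        pending.add(name);
    }

    /** Number of files still waiting to be synced. */
    synchronized int pendingCount() {
        return pending.size();
    }

    /**
     * Syncs every pending file. Files deleted in the meantime are skipped
     * instead of being tracked on deletion; their names are returned.
     */
    synchronized List<String> sync() throws IOException {
        List<String> missing = new ArrayList<>();
        Iterator<String> it = pending.iterator();
        while (it.hasNext()) {
            String name = it.next();
            try (FileChannel ch = FileChannel.open(root.resolve(name), StandardOpenOption.WRITE)) {
                ch.force(true); // flush file content and metadata to the storage device
            } catch (NoSuchFileException e) {
                missing.add(name); // file was deleted before it was ever synced
            }
            // Reached only on success or a missing file; any other IOException
            // propagates and leaves the name pending for a later retry.
            it.remove();
        }
        return missing;
    }
}
```

With this shape the caller never passes a collection of names around: the writer only
reports which files it wrote, and deletions need no bookkeeping at all.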

> IndexWriter.synced field accumulates data leading to a Memory Leak
> -------------------------------------------------------------------
>                 Key: LUCENE-2328
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.9.1, 2.9.2, 3.0, 3.0.1
>         Environment: all
>            Reporter: Gregor Kaczor
>            Priority: Minor
>             Fix For: 3.1
>   Original Estimate: 1h
>  Remaining Estimate: 1h
> I am running into a strange OutOfMemoryError. My small test application indexes
> and deletes a few files. This is repeated 60k times. Optimization
> is run every 2k times a file is indexed. Index size is 50KB. I analyzed
> the heap dump file and found that the IndexWriter.synced field occupied more than
> half of the heap. That field is a private HashSet without a getter. Its task is
> to hold the files which have already been synced.
> There are two calls to addAll and one call to add on synced, but no remove or
> clear throughout the lifecycle of the IndexWriter instance.
> According to the Eclipse Memory Analyzer, synced contains 32618 entries which
> look like file names such as "_e065_1.del" or "_e067.cfs".
> The index directory contains only 10 files.
> I guess synced is holding obsolete data.
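
The growth pattern described in the report can be reproduced in isolation. The sketch
below is not IndexWriter code: the class SyncedSetGrowth, the method simulate and the
generated file names are hypothetical. It only shows what happens to a set that receives
add calls but never remove, and how pruning it against the live files (one possible
remedy, assumed here purely for comparison) keeps it bounded.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration of an add-only "synced" set versus the live file set. */
class SyncedSetGrowth {
    /**
     * Simulates `cycles` rounds of "write a new file, delete the old one".
     * Returns {size of synced set, number of live files}.
     */
    static int[] simulate(int cycles, boolean prune) {
        Set<String> synced = new HashSet<>();
        Set<String> liveFiles = new HashSet<>();
        for (int i = 0; i < cycles; i++) {
            String name = "_" + Integer.toString(i, 36) + ".cfs"; // illustrative name only
            liveFiles.clear();      // the previous file is deleted
            liveFiles.add(name);    // a new file is written
            synced.add(name);       // and recorded as synced
            if (prune) {
                synced.retainAll(liveFiles); // forget names of files that no longer exist
            }
        }
        return new int[] { synced.size(), liveFiles.size() };
    }
}
```

Calling simulate(60_000, false) leaves 60000 names in the synced set while only one file
is live; with prune set to true both sizes stay at one.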

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
