lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <>
Subject [jira] Commented: (LUCENE-709) [PATCH] Enable application-level management of IndexWriter.ramDirectory size
Date Mon, 20 Nov 2006 23:36:05 GMT
Yonik Seeley commented on LUCENE-709:

Sorry for not being clearer before, Chuck. I actually did understand your point-in-time points.
I was just trying to point out that for the use cases I had in mind, the extra sync didn't
buy much.  Perhaps you have different use cases in mind where you can take action based
on the size of a RAMDirectory without regard to what other modifiers are doing.

> Not quite, because the bug already exists in lucene in RAMDirectory.list().

I agree.  On the first quick pass I only commented on its non-thread-safe behavior because
I thought it was just a debugging method... looking at it again, I see it should be fixed.
IIRC, Michael may have already fixed it in his lockless patch.

I'm +1 on your other changes, such as converting to a HashMap (too bad we can't use ConcurrentHashMap).
I'd also be OK with changing the buffers Vector to an ArrayList while our eyes are on this
part of the code.  That might be more cosmetic than anything else though.
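
For illustration only, here is a rough sketch of the kind of change being discussed: a plain HashMap guarded by synchronized methods, with list() copying the file names while the lock is held. The class SketchRamDirectory and all of its methods are hypothetical names invented for this sketch; it is not the code from the patch or from Lucene's RAMDirectory.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a RAM-backed directory (not Lucene's RAMDirectory).
class SketchRamDirectory {
    // A plain HashMap is enough because every access goes through a synchronized method.
    private final Map<String, Long> fileLengths = new HashMap<>();
    private long sizeInBytes = 0L;

    // Adds or replaces a file and keeps the running byte total consistent.
    synchronized void putFile(String name, long length) {
        Long previous = fileLengths.put(name, length);
        sizeInBytes += length - (previous == null ? 0L : previous);
    }

    // Removes a file, if present, and subtracts its length from the total.
    synchronized void deleteFile(String name) {
        Long previous = fileLengths.remove(name);
        if (previous != null) {
            sizeInBytes -= previous;
        }
    }

    // Copies the names under the lock, so a concurrent put or delete
    // cannot disturb the iteration that builds the array.
    synchronized String[] list() {
        return fileLengths.keySet().toArray(new String[0]);
    }

    // A point-in-time value: it may be stale as soon as the lock is released.
    synchronized long sizeInBytes() {
        return sizeInBytes;
    }
}
```

Note that the value returned by sizeInBytes() is only a snapshot, which is the point-in-time caveat raised above: the extra synchronization keeps the number internally consistent but does not stop other threads from changing it right after the call returns.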

Please continue with your patch if you would like.

> [PATCH] Enable application-level management of IndexWriter.ramDirectory size
> ----------------------------------------------------------------------------
>                 Key: LUCENE-709
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 2.0.1
>         Environment: All
>            Reporter: Chuck Williams
>         Attachments: ramdir.patch, ramdir.patch, ramDirSizeManagement.patch, ramDirSizeManagement.patch,
> IndexWriter currently only supports bounding of the in-memory index cache using maxBufferedDocs,
which limits it to a fixed number of documents.  When document sizes vary substantially, especially
when documents cannot be truncated, this leads either to inefficiencies from a too-small value
or OutOfMemoryErrors from a too-large value.
> This simple patch exposes IndexWriter.flushRamSegments(), and provides access to size
information about IndexWriter.ramDirectory so that an application can manage this based on
total number of bytes consumed by the in-memory cache, thereby allowing a larger number of smaller
documents or a smaller number of larger documents.  This can lead to much better performance
while eliminating the possibility of OutOfMemoryErrors.
> The actual job of managing to a size constraint, or any other constraint, is left up
to the application.
> The addition of synchronized to flushRamSegments() is only for safety of an external
call.  It has no significant effect on internal calls since they all come from a synchronized
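
As a minimal sketch of the application-side management the description leaves to the caller, the code below flushes whenever the buffered byte count reaches a limit instead of counting documents. Everything here is an assumption made for illustration: FlushableWriter, ramSizeInBytes(), InMemoryWriter and ByteBoundedIndexer are hypothetical names, and only flushRamSegments() is taken from the issue text. It does not use the real IndexWriter.

```java
// Hypothetical view of what an application would need from the writer.
// Only the name flushRamSegments() comes from the issue; the rest is assumed.
interface FlushableWriter {
    void addDocument(String doc);
    long ramSizeInBytes();
    void flushRamSegments();
}

// Toy writer used only to exercise the policy: one byte per character buffered.
final class InMemoryWriter implements FlushableWriter {
    private long bufferedBytes = 0L;
    private int flushCount = 0;

    public synchronized void addDocument(String doc) {
        bufferedBytes += doc.length();
    }

    public synchronized long ramSizeInBytes() {
        return bufferedBytes;
    }

    public synchronized void flushRamSegments() {
        bufferedBytes = 0L;
        flushCount++;
    }

    synchronized int flushCount() {
        return flushCount;
    }
}

// Application-level policy: bound the in-memory cache by bytes, not by documents.
final class ByteBoundedIndexer {
    private final FlushableWriter writer;
    private final long maxRamBytes;

    ByteBoundedIndexer(FlushableWriter writer, long maxRamBytes) {
        this.writer = writer;
        this.maxRamBytes = maxRamBytes;
    }

    // Synchronized so that the add, the size check and the flush are not
    // interleaved with another add made through this same wrapper.
    synchronized void add(String doc) {
        writer.addDocument(doc);
        if (writer.ramSizeInBytes() >= maxRamBytes) {
            writer.flushRamSegments();
        }
    }
}
```

With a limit of 10 in this toy model, many short documents accumulate before a flush while a single long document triggers one immediately, which is the trade-off the description is after. Checking the size after each add means the limit can be overshot by at most one document.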
