lucene-dev mailing list archives

From "Shai Erera (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (LUCENE-6849) Add IndexWriter API to write segment(s) without refreshing them
Date Thu, 05 Nov 2015 10:59:27 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14991515#comment-14991515 ]

Shai Erera edited comment on LUCENE-6849 at 11/5/15 10:58 AM:
--------------------------------------------------------------

LGTM. And +1 on making both flush methods public API. It's an expert API, and I believe users who intend
to call {{flush()}} can also understand the implications of calling {{flush(true, true)}}.
Later we can consider consolidating and enhancing this with a {{flush(FlushOptions)}} method,
where {{FlushOptions}} lets you specify whether you want to merge, applyDeletes, the segment size
flush threshold, etc.
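
To make the {{FlushOptions}} idea concrete, here is a rough sketch of what such an options object might look like. Everything in it is hypothetical: {{FlushOptions}} is only proposed above, the builder, the accessor names, the {{maxSegmentBytes}} field and the defaults are assumptions made for illustration, and none of it is taken from the patch or from Lucene.

```java
// Hypothetical sketch only: FlushOptions is merely proposed in this comment.
// All names and defaults below are illustrative assumptions, not Lucene API.
final class FlushOptions {
    private final boolean triggerMerge;     // whether the flush may kick off merges
    private final boolean applyAllDeletes;  // whether buffered deletes are applied
    private final long maxSegmentBytes;     // assumed "segment size flush threshold"; -1 means unset

    private FlushOptions(Builder b) {
        this.triggerMerge = b.triggerMerge;
        this.applyAllDeletes = b.applyAllDeletes;
        this.maxSegmentBytes = b.maxSegmentBytes;
    }

    boolean triggerMerge() { return triggerMerge; }
    boolean applyAllDeletes() { return applyAllDeletes; }
    long maxSegmentBytes() { return maxSegmentBytes; }

    static Builder builder() { return new Builder(); }

    static final class Builder {
        // Defaults are an arbitrary choice for this sketch.
        private boolean triggerMerge = false;
        private boolean applyAllDeletes = false;
        private long maxSegmentBytes = -1L;

        Builder triggerMerge(boolean value) {
            this.triggerMerge = value;
            return this;
        }

        Builder applyAllDeletes(boolean value) {
            this.applyAllDeletes = value;
            return this;
        }

        Builder maxSegmentBytes(long bytes) {
            if (bytes <= 0) {
                throw new IllegalArgumentException("maxSegmentBytes must be positive: " + bytes);
            }
            this.maxSegmentBytes = bytes;
            return this;
        }

        FlushOptions build() { return new FlushOptions(this); }
    }
}
```

With something like this, the equivalent of {{flush(true, true)}} would read {{FlushOptions.builder().triggerMerge(true).applyAllDeletes(true).build()}}, and new knobs could be added later without growing the list of boolean parameters.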

A few comments:

* If you make the second {{flush()}} public:
** I think we should document in {{flush()}} when you should use the second one?
** We should add a testFlushNoCommitButMergeAndApplyDeletes?
** Add the second {{flush()}} variant to RandomIndexWriter?
* In {{RandomIndexWriter.maybeFlushOrCommit}}, should we also sometimes randomly apply deletes
and trigger merges?



was (Author: shaie):
LGTM. And +1 on making both flush public API. It's an expert API and I believe users who intend
to call {{flush()}} can also understand the implications of calling {{flush(true, true)}}.
Later we can consider consolidating and enhance this with a {{flush(FlushOptions)}} method,
where {{FlushOptions}} lets you specify whether you want to merge, applyDeletes, segment size
flush threshold etc.

A few comments:

* If you make the second flush() public
** I think we should document in ({{flush()}}) when you should use the second one?
** We should add a testFlushNoCommitButMergeAndApplyDeletes?
** Do you want to also add the second flush() variant to RandomIndexWriter?
* In {{RandomIndexWriter.maybeFlushOrCommit}}, should we also sometimes randomly apply deletes
and trigger merges?


> Add IndexWriter API to write segment(s) without refreshing them
> ---------------------------------------------------------------
>
>                 Key: LUCENE-6849
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6849
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: Trunk, 5.4
>
>         Attachments: LUCENE-6849.patch
>
>
> Today, the only way to have {{IndexWriter}} free up some heap is to invoke refresh or
> flush or close it, but these are all quite costly, and do much more than simply "move bytes
> to disk".
> I think we should add a simple API, e.g. "move the biggest in-memory segment to disk"
> to 1) give more granularity (there could be multiple in-memory segments), and 2) only move
> bytes to disk (not refresh, not fsync, etc.).
> This way apps that want to be more careful on how heap is used can have more control.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

