lucene-dev mailing list archives

From "Ning Li (JIRA)" <>
Subject [jira] Commented: (LUCENE-1194) Add deleteByQuery to IndexWriter
Date Wed, 27 Feb 2008 16:23:52 GMT


Ning Li commented on LUCENE-1194:

> As of LUCENE-1044, when autoCommit=true, IndexWriter only commits on
> committing a merge, not with every flush.

I see. Interesting.

> Hmmm ... but, there is actually the reverse problem now with my patch:
> an auto commit can actually commit deletes before the corresponding
> added docs are committed (from updateDocument calls). This is
> because, when we commit we only sync & commit the merged segments (not
> the flushed segments).


> Though, autoCommit=true is deprecated; once we
> remove that (in 3.0) this problem goes away. I'll have to ponder how
> to fix that for now up until then; it's tricky. Maybe before 3.0
> we'll just have to flush all deletes whenever we flush a new
> segment....

I think flushing deletes when we flush a new segment is fine before 3.0.
In 3.0, is the plan to default autoCommit to false or to disable autoCommit
entirely? The latter, right?

> Also, I don't think we need updateByQuery? Eg in 3.0 when autoCommit
> is hardwired to false then you can deleteDocuments(Query) and then
> addDocument(...) and it will be atomic.

Agree. When autoCommit is disabled, we don't need any update method.
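
To illustrate the point above, here is a minimal sketch in plain Java (not the Lucene API) of why a delete followed by an add behaves atomically when nothing is published until an explicit commit. The class ExplicitCommitSketch is hypothetical; its methods only borrow the names deleteDocuments and addDocument, and a query is modeled as a Predicate over strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical toy model, not Lucene: readers only ever see the last commit.
final class ExplicitCommitSketch {
    private List<String> committed = new ArrayList<>();     // what readers see
    private final List<String> working = new ArrayList<>(); // uncommitted changes

    void addDocument(String doc) {
        working.add(doc);
    }

    // Stand-in for deleteDocuments(Query): drop every working doc the query matches.
    void deleteDocuments(Predicate<String> query) {
        working.removeIf(query);
    }

    // Publish the working state in one step, as with autoCommit disabled.
    void commit() {
        committed = new ArrayList<>(working);
    }

    // A "reader" is a snapshot of the last committed state.
    List<String> openReader() {
        return new ArrayList<>(committed);
    }

    public static void main(String[] args) {
        ExplicitCommitSketch writer = new ExplicitCommitSketch();
        writer.addDocument("id=1 v1");
        writer.commit();

        // Update = delete by query, then add. Neither step is visible yet.
        writer.deleteDocuments(doc -> doc.startsWith("id=1 "));
        writer.addDocument("id=1 v2");
        System.out.println("before commit: " + writer.openReader());

        writer.commit();
        System.out.println("after commit:  " + writer.openReader());
    }
}
```

In this model a reader sees either the old document or the new one, never the state where the old document is gone and the replacement is missing, which is why no separate update-by-query method is needed once commits are explicit.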

> Add deleteByQuery to IndexWriter
> --------------------------------
>                 Key: LUCENE-1194
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.4
>         Attachments: LUCENE-1194.patch
> This has been discussed several times recently:
> If we add deleteByQuery to IndexWriter then this is a big step towards
> allowing IndexReader to be readonly.
> I took the approach suggested in that first thread: I buffer delete
> queries just like we now buffer delete terms, holding the max docID
> that the delete should apply to.
> Then, I also decoupled flushing deletes (mapping term or query -->
> actual docIDs that need deleting) from flushing added documents, and
> now I flush deletes only when a merge is started, or on commit() or
> close().  SegmentMerger now exports the docID map it used when
> merging, and I use that to renumber the max docIDs of all pending
> deletes.
> Finally, I turned off tracking of memory usage of pending deletes
> since they now live beyond each flush.  Deletes are now only
> explicitly flushed if you set maxBufferedDeleteTerms to something
> other than DISABLE_AUTO_FLUSH.  Otherwise they are flushed at the
> start of every merge.
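
The buffering scheme described in the issue can be sketched roughly as follows. This is a toy model in plain Java under stated assumptions, not the code from the patch: BufferedDeletesSketch, PendingDelete, docIdLimit and remapLimit are hypothetical names, documents are plain strings, a query is a Predicate, and a single list stands in for all segments. It shows a buffered delete carrying a docID limit, and that limit being renumbered with a docID map after a merge compacts deleted documents away.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical toy model, not the LUCENE-1194 patch.
final class BufferedDeletesSketch {
    // A buffered delete: the query plus the docID limit it applies to.
    static final class PendingDelete {
        final Predicate<String> query;
        int docIdLimit; // only docIDs < docIdLimit may be deleted
        PendingDelete(Predicate<String> query, int docIdLimit) {
            this.query = query;
            this.docIdLimit = docIdLimit;
        }
    }

    private final List<String> docs = new ArrayList<>(); // index = docID, null = deleted
    private final List<PendingDelete> pending = new ArrayList<>();

    void addDocument(String doc) {
        docs.add(doc);
    }

    // Buffer the delete; documents added later get docIDs >= the limit and are spared.
    void deleteDocuments(Predicate<String> query) {
        pending.add(new PendingDelete(query, docs.size()));
    }

    // Resolve each buffered query to actual docIDs below its limit.
    void flushDeletes() {
        for (PendingDelete d : pending) {
            for (int id = 0; id < d.docIdLimit; id++) {
                String doc = docs.get(id);
                if (doc != null && d.query.test(doc)) {
                    docs.set(id, null);
                }
            }
        }
        pending.clear();
    }

    // Compact deleted docs as a merge would, then renumber pending limits.
    // Returns the docID map: old docID -> new docID, or -1 if the doc was dropped.
    int[] mergeAndRenumber() {
        int[] docMap = new int[docs.size()];
        List<String> merged = new ArrayList<>();
        for (int id = 0; id < docs.size(); id++) {
            String doc = docs.get(id);
            if (doc == null) {
                docMap[id] = -1;
            } else {
                docMap[id] = merged.size();
                merged.add(doc);
            }
        }
        for (PendingDelete d : pending) {
            d.docIdLimit = remapLimit(docMap, d.docIdLimit);
        }
        docs.clear();
        docs.addAll(merged);
        return docMap;
    }

    // New limit = number of surviving docs whose old docID was below the old limit.
    static int remapLimit(int[] docMap, int oldLimit) {
        int survivors = 0;
        for (int id = 0; id < oldLimit && id < docMap.length; id++) {
            if (docMap[id] != -1) {
                survivors++;
            }
        }
        return survivors;
    }

    List<String> liveDocs() {
        List<String> live = new ArrayList<>();
        for (String doc : docs) {
            if (doc != null) {
                live.add(doc);
            }
        }
        return live;
    }
}
```

For example, a delete buffered with limit 4 before a merge that drops two earlier documents ends up with limit 2 afterwards, so a matching document added after the delete call is still left alone when the delete is finally flushed.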

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
