lucene-java-user mailing list archives

From "Khawaja Shams" <>
Subject Re: Which is faster/better
Date Tue, 25 Nov 2008 17:59:34 GMT
On Tue, Nov 25, 2008 at 8:42 AM, Grant Ingersoll wrote:

> On Nov 25, 2008, at 10:46 AM, Michael McCandless wrote:
>>> If you already have the docId, would you need to/want to do
>>> delete-by-Query or even delete-by-Term?  Isn't delete-by-id a lot lighter
>>> weight since it only marks the doc as deleted, whereas d-b-Q can
>>> potentially force a flush, etc?
>> I guess the question is how you got that docID in the first place?  If
>> you got it by running a query, and deleting all docIDs that are
>> returned, then you could dBQ instead?
> User does a search.  Gets back a set of docs.  Picks docs to delete,
> deletes them.

Grant, can we assume that the document ID will remain consistent between
the time the user obtains the results and the time they click delete? I was
under the impression that document IDs can change on optimize, etc.
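To make the concern concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: `ToyIndex`, `compact()`, and the `doc-a`/`doc-b`/`doc-c` keys are all hypothetical names, and a docID is assumed to be nothing more than a document's current position. It shows how an ID handed out before a compaction can point at a different document afterwards, while a stable key still finds the right one. In Lucene itself the analogous approach would presumably be to index a unique key field and delete by Term, e.g. with `IndexWriter.deleteDocuments(Term)`.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only (hypothetical names, not Lucene classes):
// a docID is just a document's current position in the index.
class ToyIndex {
    private final List<String> keys = new ArrayList<>();
    private final List<Boolean> deleted = new ArrayList<>();

    // Adds a document and returns the docID it has right now.
    int add(String key) {
        keys.add(key);
        deleted.add(false);
        return keys.size() - 1;
    }

    // Marks whatever document currently sits at this position.
    void deleteById(int docId) {
        deleted.set(docId, true);
    }

    // Marks the document(s) carrying this stable key.
    void deleteByKey(String key) {
        for (int i = 0; i < keys.size(); i++) {
            if (keys.get(i).equals(key)) {
                deleted.set(i, true);
            }
        }
    }

    // Stand-in for an optimize: drop deleted docs and renumber the survivors.
    void compact() {
        List<String> liveKeys = new ArrayList<>();
        for (int i = 0; i < keys.size(); i++) {
            if (!deleted.get(i)) {
                liveKeys.add(keys.get(i));
            }
        }
        keys.clear();
        keys.addAll(liveKeys);
        deleted.clear();
        for (int i = 0; i < keys.size(); i++) {
            deleted.add(false);
        }
    }

    // Current docID of a live document, or -1 if absent or deleted.
    int docIdOf(String key) {
        for (int i = 0; i < keys.size(); i++) {
            if (!deleted.get(i) && keys.get(i).equals(key)) {
                return i;
            }
        }
        return -1;
    }
}

class DocIdDriftDemo {
    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("doc-a");
        int staleIdForB = index.add("doc-b"); // ID the user saw in the results
        index.add("doc-c");

        // Someone else deletes doc-a and the index is compacted in between.
        index.deleteByKey("doc-a");
        index.compact();

        System.out.println("doc-b now has id " + index.docIdOf("doc-b"));
        System.out.println("stale id " + staleIdForB + " now belongs to doc-c: "
                + (index.docIdOf("doc-c") == staleIdForB));

        // Deleting by key removes the intended document regardless of renumbering.
        index.deleteByKey("doc-b");
        System.out.println("doc-c still live: " + (index.docIdOf("doc-c") >= 0));
    }
}
```

In this model, calling `deleteById(staleIdForB)` after the compaction would remove doc-c rather than doc-b, which is the failure I am worried about.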

>> Lucene's (IndexWriter's) dbQ doesn't force a flush: it's buffered just
>> like other deletes and then applied in bulk at certain times.  When
>> autoCommit is false, currently the deletes are applied when a
>> merge wants to start (ie not at each segment flush).  Or, if you call
>> commit().
> I was just going based off the code of the two:  In the IndexReader, all
> it's doing is marking a bit in a bit vector, right?  Whereas in the
> IndexWriter, it's checking if it's time to flush, etc.
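As a rough illustration of the two behaviours being compared, here is a toy sketch in plain Java. It is not Lucene's implementation: `ToyReader`, `ToyWriter`, and their methods are hypothetical names, and the only assumptions taken from the thread are that the reader-side delete sets a bit immediately, while the writer-side delete is buffered and applied in bulk later (modelled here only at `commit()`, ignoring the merge-triggered case).

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy stand-in for the reader-side path: a delete is one bit flip, visible at once.
class ToyReader {
    private final BitSet deletedDocs = new BitSet();

    void deleteDocument(int docId) {
        deletedDocs.set(docId);
    }

    boolean isDeleted(int docId) {
        return deletedDocs.get(docId);
    }
}

// Toy stand-in for the writer-side path: deletes are queued, then applied in bulk.
class ToyWriter {
    private final List<String> keys = new ArrayList<>();
    private final BitSet deletedDocs = new BitSet();
    private final List<String> bufferedDeletes = new ArrayList<>();

    void addDocument(String key) {
        keys.add(key);
    }

    // Only records the request; no document is marked yet.
    void deleteByKey(String key) {
        bufferedDeletes.add(key);
    }

    int bufferedDeleteCount() {
        return bufferedDeletes.size();
    }

    // Resolves every buffered delete against the documents, then clears the buffer.
    void commit() {
        for (int docId = 0; docId < keys.size(); docId++) {
            if (bufferedDeletes.contains(keys.get(docId))) {
                deletedDocs.set(docId);
            }
        }
        bufferedDeletes.clear();
    }

    boolean isDeleted(int docId) {
        return deletedDocs.get(docId);
    }
}

class DeletePathsDemo {
    public static void main(String[] args) {
        ToyReader reader = new ToyReader();
        reader.deleteDocument(3);
        System.out.println("reader sees doc 3 deleted: " + reader.isDeleted(3));

        ToyWriter writer = new ToyWriter();
        writer.addDocument("doc-a");
        writer.addDocument("doc-b");
        writer.deleteByKey("doc-b");
        System.out.println("before commit, doc 1 deleted: " + writer.isDeleted(1));
        writer.commit();
        System.out.println("after commit, doc 1 deleted: " + writer.isDeleted(1));
    }
}
```

The point of the sketch is only that the buffered path defers the work rather than flushing on each call, so the per-call cost of the two styles may be closer than it looks from the code.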