lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Which is faster/better
Date Tue, 25 Nov 2008 15:46:45 GMT

Grant Ingersoll wrote:

>
> On Nov 25, 2008, at 7:53 AM, Michael McCandless wrote:
>
>>
>> As of 2.4, IndexWriter now provides delete-by-Query, which I think
>> ought to meet nearly all of the cases where someone wants to
>> delete-by-docID using IndexReader.
>>
>> Or are there situations out there where delete-by-docID is still
>> compelling?
>
>
> Assuming delete-by-DocId means IndexReader.deleteDocument(int)  
> right?  That is, you mean the internal Lucene doc id, right?

Right, I mean delete by internal docID.

> If you already have the docId, would you need to/want to do
> delete-by-Query or even delete-by-Term?  Isn't delete-by-id a lot
> lighter weight since it only marks the doc as deleted, whereas d-b-Q
> can potentially force a flush, etc?

I guess the question is how you got that docID in the first place?  If
you got it by running a query, and deleting all docIDs that are
returned, then you could dBQ instead?

Lucene's (IndexWriter's) dBQ doesn't force a flush: it's buffered just
like other deletes and then applied in bulk at certain times.  When
autoCommit is false, the deletes are currently applied when a merge
wants to start (i.e. not at each segment flush), or when you call
commit().
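
To make the buffering behavior concrete, here is a toy model in plain
Java.  It is only a sketch and not Lucene code: the class
BufferedDeletesSketch and all of its methods are hypothetical names
invented for illustration, a "query" is reduced to a predicate over the
document text, and merge-triggered application is left out.  In Lucene
2.4 the corresponding real calls are IndexWriter.deleteDocuments(Query)
and IndexWriter.commit().

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.function.Predicate;

/** Toy model of buffered deletes. Hypothetical; not Lucene's implementation. */
class BufferedDeletesSketch {
    /** A buffered delete: the query plus the number of docs present when it was issued. */
    private static final class Pending {
        final Predicate<String> query;
        final int docLimit;
        Pending(Predicate<String> query, int docLimit) {
            this.query = query;
            this.docLimit = docLimit;
        }
    }

    private final List<String> docs = new ArrayList<>();   // list index plays the role of the internal docID
    private final BitSet deleted = new BitSet();           // one "deleted" mark per docID
    private final List<Pending> pending = new ArrayList<>();

    void addDocument(String text) {
        docs.add(text);
    }

    /** Delete by internal docID: only sets a mark, nothing is buffered. */
    void deleteByDocId(int docId) {
        deleted.set(docId);
    }

    /** Delete by query: only buffers the request, no document is touched yet. */
    void deleteByQuery(Predicate<String> query) {
        pending.add(new Pending(query, docs.size()));
    }

    /** Applies all buffered deletes in bulk, then clears the buffer. */
    void commit() {
        for (Pending p : pending) {
            // Only documents added before the delete was issued are affected.
            for (int docId = 0; docId < p.docLimit; docId++) {
                if (p.query.test(docs.get(docId))) {
                    deleted.set(docId);
                }
            }
        }
        pending.clear();
    }

    int liveDocCount() {
        return docs.size() - deleted.cardinality();
    }

    int pendingDeleteCount() {
        return pending.size();
    }
}
```

In this model, deleteByQuery() leaves liveDocCount() unchanged until
commit() runs, which is the point made above: issuing the delete is
cheap, and the work of resolving it to docIDs happens later, in bulk.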

Mike


