lucene-java-user mailing list archives

From Michael McCandless
Subject Re: Which is faster/better
Date Tue, 25 Nov 2008 17:15:47 GMT

Grant Ingersoll wrote:

> On Nov 25, 2008, at 10:46 AM, Michael McCandless wrote:
>>> If you already have the docId, would you need to/want to do
>>> delete-by-Query or even delete-by-Term?  Isn't delete-by-id a lot
>>> lighter weight since it only marks the doc as deleted, whereas d-b-Q
>>> can potentially force a flush, etc?
>> I guess the question is how you got that docID in the first place?
>> If you got it by running a query, and deleting all docIDs that are
>> returned, then you could dBQ instead?
> User does a search.  Gets back a set of docs.  Picks docs to delete,  
> deletes them.

User means end-user, eg via a UI?  Probably delete-by-term would  
suffice here?

If user means developer who wrote some interesting programmatic logic  
that iterates through the docs returned by a search and deletes  
certain ones, that could be implemented as a Filter, right?  I guess  
it's sort of a hassle now since IndexWriter doesn't have a
delete-by-Filter (you'd have to wrap it in ConstantScoreQuery, which is sort of

>> Lucene's (IndexWriter's) dBQ doesn't force a flush: it's buffered just
>> like other deletes and then applied in bulk at certain times.  When
>> autoCommit is false, currently the deletes are applied when a
>> merge wants to start (ie not at each segment flush), or when you
>> call commit().
> I was just going based off the code of the two:  In the IndexReader,
> all it's doing is marking a bit in a bit vector, right?  Whereas in
> the IndexWriter, it's checking if it's time to flush, etc.

Sure, IndexWriter has to manage other things (flushing new segments,  
merging, etc).

But the mechanics of deletion (marking bits in the BitVector) are the
same because, under the hood, when IndexWriter applies the deletes,
it's asking a [private] SegmentReader to do so.
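
To make that concrete, here is a rough toy model in plain Java. It is not
Lucene code: ToySegment and ToyWriter are made-up names, java.util.BitSet
stands in for the BitVector, and a single "id" string per doc stands in for
a Term. It only sketches the idea that a writer-style delete is buffered and
later applied through the same bit-marking path a reader-style delete uses.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical stand-in for a segment reader: one bit per doc, set when deleted. */
final class ToySegment {
    private final String[] ids;    // made-up "id" value for each docID
    private final BitSet deleted;  // stands in for the deleted-docs bit vector

    ToySegment(String... ids) {
        this.ids = ids.clone();
        this.deleted = new BitSet(ids.length);
    }

    /** Reader-style delete by docID: marks the bit immediately. */
    void deleteDocument(int docId) {
        if (docId < 0 || docId >= ids.length) {
            throw new IndexOutOfBoundsException("docId " + docId);
        }
        deleted.set(docId);
    }

    /** Marks every live doc whose id matches; returns how many were newly deleted. */
    int deleteById(String id) {
        int count = 0;
        for (int doc = 0; doc < ids.length; doc++) {
            if (ids[doc].equals(id) && !deleted.get(doc)) {
                deleteDocument(doc);  // same bit-marking path as the direct delete
                count++;
            }
        }
        return count;
    }

    boolean isDeleted(int docId) { return deleted.get(docId); }

    int numDeleted() { return deleted.cardinality(); }
}

/** Hypothetical stand-in for a writer: buffers delete requests, applies them in bulk. */
final class ToyWriter {
    private final ToySegment segment;
    private final List<String> bufferedDeletes = new ArrayList<>();

    ToyWriter(ToySegment segment) { this.segment = segment; }

    /** Only records the request; no bits change yet. */
    void deleteDocuments(String id) { bufferedDeletes.add(id); }

    /** Applies all buffered deletes by asking the segment to mark bits. */
    int commit() {
        int applied = 0;
        for (String id : bufferedDeletes) {
            applied += segment.deleteById(id);
        }
        bufferedDeletes.clear();
        return applied;
    }
}
```

In this sketch, calling deleteDocuments("b") on the toy writer changes
nothing in the segment until commit() runs, at which point the bits are set
by the very same deleteDocument(int) call that a direct delete-by-id uses.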

