lucene-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Deleting a document with an IndexWriter open
Date Tue, 20 Jul 2004 03:44:26 GMT
Dmitry Serebrennikov wrote:
> Doug Cutting wrote:
> 
>> Dmitry Serebrennikov wrote:
>>
>>> So here's a modified sequence of operations, perhaps a bit more 
>>> efficient than proposed by Christoph:
>>> 1) Open an IndexReader for searching - S. Keep it open until the 
>>> transaction is committed.
>>> 2) Open a second IndexReader for deletions - D.
>>> 3) Create a filter bitset F (or use any other mechanism for storing 
>>> document numbers to be deleted)
>>> 4) Open an IndexWriter for new documents - W.
>>> 5) As documents come in, add them using W. Find their old versions in 
>>> D and record their document numbers in F. D will not show any new 
>>> documents, only documents present at the time D was created.
>>> 6) Close W.
>>> 7) Use D to delete all documents marked in F.
>>> 8) Close D.
>>
>>
>>
>> What happens if there are deletions in S and D, and then, in step 5, 
>> as documents are added to W and segments are merged, documents are 
>> renumbered?  Wouldn't that invalidate F?  Currently we don't permit 
>> one to delete documents from an IndexReader while an IndexWriter is 
>> open, to prevent this sort of thing.  Am I missing something?
> 
> 
> I was assuming that there would never be deletions in S.

Then you need to ensure that you leave the index with no deletions, and 
optimize it if it has any, to remove them.  This is probably most safely 
done as the first step, rather than the last.
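
To illustrate why existing deletions matter, here is a toy sketch of the 
renumbering problem.  It is not Lucene code: the class RenumberSketch and 
its merge() and resolve() methods are hypothetical, and the "index" is just 
a list whose slots are document numbers.  It assumes only that a merge 
drops deleted documents and renumbers the ones after them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

public class RenumberSketch {
    // Toy index (hypothetical, not Lucene's format): slot i holds a document
    // name, or null if that document has been deleted.
    // A merge drops deleted slots, so later documents move to lower numbers.
    static List<String> merge(List<String> slots) {
        List<String> merged = new ArrayList<>();
        for (String doc : slots) {
            if (doc != null) {
                merged.add(doc);
            }
        }
        return merged;
    }

    // Returns the documents that the numbers recorded in f currently point at.
    static List<String> resolve(List<String> slots, BitSet f) {
        List<String> hit = new ArrayList<>();
        for (int n = f.nextSetBit(0); n >= 0; n = f.nextSetBit(n + 1)) {
            hit.add(n < slots.size() ? slots.get(n) : null);
        }
        return hit;
    }

    public static void main(String[] args) {
        List<String> index = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        index.set(1, null);          // "b" was deleted before the transaction began

        BitSet f = new BitSet();
        f.set(2);                    // F records "c" while it is document 2

        System.out.println("before merge: " + resolve(index, f)); // [c]
        index = merge(index);        // a merge triggered by adding documents
        System.out.println("after merge:  " + resolve(index, f)); // [d]
    }
}
```

In this model F ends up pointing at "d" rather than "c" once the merge 
squeezes out the deleted slot.  If the list starts with no deleted slots, 
merge() leaves every number unchanged and F stays valid, which is the 
reason for removing deletions up front.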

I'm not sure this method has many advantages over what Christoph 
originally suggested in:

http://www.mail-archive.com/lucene-dev%40jakarta.apache.org/msg06165.html

Doug



