lucene-dev mailing list archives

From Dmitry Serebrennikov <dmit...@earthlink.net>
Subject Re: Deleting a document with an IndexWriter open
Date Tue, 20 Jul 2004 18:39:48 GMT
Doug Cutting wrote:

>
> Then you need to ensure that the index has no deletions, and 
> optimize it if it has any, to remove them.  This is probably most 
> safely done as the first step, rather than the last.

Good point. I didn't think about this.

>
> I'm not sure this method has many advantages over what Christoph 
> originally suggested in:
>
> http://www.mail-archive.com/lucene-dev%40jakarta.apache.org/msg06165.html

Yes, I agree that it's not too different. The main benefit I see, and I 
think this may be significant for some applications, is that in 
Christoph's original method new documents must be iterated over twice - 
in his steps 2 and 4. This may be a problem for some applications 
because it requires buffering newly arrived documents somewhere - 
something that Lucene will not directly help with. That means people may 
have to write substantial external code to support this usage (or 
perhaps use a database, file system, etc).

With the modification I'm proposing, the documents can be added to the 
index as they arrive. No buffering is required and documents are handled 
exactly once. The "buffering" occurs instead on the ids of documents to be 
deleted, which is much easier to do, and one can even use the 
java.util.BitSet class (the same type that Lucene's Filter works with).
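
To make the idea concrete, here is a minimal sketch of buffering document ids in a BitSet and applying them later in one pass. PendingDeletes is a hypothetical helper, not a Lucene class, and the IntConsumer callback stands in for whatever delete-by-document-number call the reader offers. The sketch assumes the buffered ids still refer to the same documents when they are finally applied.

```java
import java.util.BitSet;
import java.util.function.IntConsumer;

/**
 * Buffers the ids of documents to delete while new documents keep being
 * added. Hypothetical helper for illustration; not part of Lucene.
 */
class PendingDeletes {
    // One bit per document id; a set bit means "delete this document later".
    private final BitSet ids = new BitSet();

    /** Remember that the document with this id should be deleted. */
    void mark(int docId) {
        ids.set(docId);
    }

    /** Number of distinct ids currently buffered. */
    int size() {
        return ids.cardinality();
    }

    /**
     * Passes every buffered id, in ascending order, to the supplied callback
     * (assumed to wrap the real delete call), then clears the buffer.
     * Returns how many ids were applied.
     */
    int flush(IntConsumer deleter) {
        int applied = 0;
        for (int id = ids.nextSetBit(0); id >= 0; id = ids.nextSetBit(id + 1)) {
            deleter.accept(id);
            applied++;
        }
        ids.clear();
        return applied;
    }
}
```

Marking the same id twice costs nothing, since the bit is simply already set, and the memory used grows with the highest buffered id rather than with the size of the documents themselves.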

>
> Doug
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
>
>



