lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <>
Subject Re: problems with deleteDocuments
Date Wed, 04 Jul 2007 14:30:01 GMT
This is exactly the behavior I'd expect.

Consider what would happen otherwise. Say you have documents
with the following values for a field (call it blah).
some data
some data I put in the index
lots of data

Then I don't want deleting on the term blah:data to remove all
of them. Which seems to be what you're asking. Even if
you restricted things to "phrases", then deleting on the term
'blah:some data' would remove two documents.

So, while UN_TOKENIZED isn't a *requirement*, exact total term
matches *is* the requirement. By that, I meant that whatever
goes into the field must not be broken into pieces by the indexing
tokenizer for deletes to work as you expect.


On 7/4/07, Nick Johnson <> wrote:
> I'm having several problems with deleting documents with Lucene 2.2.
> Via the IndexWriter, I can successfully delete a document by its primary
> key via a Term, but ONLY if the field was stored as
> Field.Index.UN_TOKENIZED.  If it was stored as TOKENIZED, the debug output
> says it is deleting the document, but subsequent searches executed against
> terms that existed only in the original document (and not the one I add to
> replace it) still return the deleted document.  (Searches against terms
> only in the new document will return the new document.)
> Another problem is that if I first delete the document using
> deleteDocuments(Term) and then add a new document that happens to have
> identical fields to the one I deleted, the new document is not added.  Of
> course this operation is fairly wasteful, but it seems like the new
> document should replace the old one, even though they're the same.  This
> happens even if I perform a flush() after the delete and before the add.
> It will also happen even if I flush, close the IndexWriter and create a
> new IndexWriter.  It seems that once a document with a particular set of
> fields has been deleted, an identical one can never be re-added.
> Any pointers or things I should check or more detailed information I can
> provide to track this down?
>    Nick
> --
> "Courage isn't just a matter of not being frightened, you know. It's being
> afraid and doing what you have to do anyway."
>    Doctor Who - Planet of the Daleks
> This message has been brought to you by Nick Johnson 2.3b1 and the number
> 6.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message