lucene-java-user mailing list archives

From "Kalani Ruwanpathirana" <>
Subject Re: Deleted document terms
Date Tue, 26 Aug 2008 09:16:21 GMT
Hi John,

Are you sure you made the id "tokenized" while indexing? I was able to
overcome this issue by indexing the id as a tokenized field, which was then
used for the deletion:

document.add(new Field("id", id, Field.Store.YES, Field.Index.TOKENIZED));
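
For illustration only, here is a toy model (plain Java, not Lucene code) of why the way the id is indexed matters for delete-by-term. It assumes that a delete matches only a term stored in the index exactly as given, and that a tokenized field is passed through an analyzer first. The class ToyIdIndex and its analyze() method, which just splits on whitespace and lowercases, are hypothetical stand-ins; a real analyzer may behave differently.

```java
import java.util.*;

// Toy model only: a map from indexed term to document numbers. Not Lucene code.
class ToyIdIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final Set<Integer> deleted = new HashSet<>();
    private int nextDoc = 0;

    // Hypothetical stand-in for an analyzer: split on whitespace and lowercase.
    static List<String> analyze(String value) {
        List<String> tokens = new ArrayList<>();
        for (String t : value.trim().split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t.toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    // tokenized=true indexes the analyzed tokens; false indexes the raw value as one term.
    int add(String id, boolean tokenized) {
        int doc = nextDoc++;
        List<String> terms = tokenized ? analyze(id) : Collections.singletonList(id);
        for (String term : terms) {
            postings.computeIfAbsent(term, k -> new HashSet<>()).add(doc);
        }
        return doc;
    }

    // Marks every live document containing exactly this term as deleted; returns the count.
    int deleteByTerm(String term) {
        int count = 0;
        for (int doc : postings.getOrDefault(term, Collections.emptySet())) {
            if (deleted.add(doc)) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        ToyIdIndex index = new ToyIdIndex();
        index.add("MYKEY", true);                        // indexed as the token "mykey"
        System.out.println(index.deleteByTerm("MYKEY")); // prints 0: no such term in the index
        System.out.println(index.deleteByTerm("mykey")); // prints 1: matches the indexed token
    }
}
```

In this model the delete succeeds only when the term passed in is spelled exactly like the term that was indexed, so it is worth checking what your analyzer actually produces for the id.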


On Tue, Aug 26, 2008 at 2:15 PM, Michael McCandless <> wrote:

> John Patterson wrote:
>
>> I just discovered some strange behaviour with deleted documents.  I do a
>> search for documents with a certain query and delete one using
>> IndexWriter.deleteDocuments(Term) using a key for the term.  Then I repeat
>> the search and the document is still there because I use a custom
>> HitCollector which does not check IndexReader.isDeleted(int).  That is all
>> expected.
>
> Hmm -- once a document is deleted, your HitCollector won't ever see it.
> During searching, isDeleted is called per document at a very low level.
> If your HitCollector is seeing it, it sounds like it wasn't really deleted.
> Are you sure you closed the IndexWriter and then reopened your searcher, so
> that the searcher will see the deletion?
>
>> But when I try to show the deleted document by searching by key using the
>> same term it was deleted with, it is not found.  So it seems that the term
>> (id:MYKEY) is removed from the index.
>
> This is odd -- the document should either be deleted (entirely), or not.
> You shouldn't get different behavior if you search for the doc one way vs
> another.
>
>> So I was surprised that the term for the id was removed but not the other
>> terms for document.
>
> That makes two of us!
> Mike
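
Mike's question about closing the IndexWriter and reopening the searcher can be pictured with a small toy model (plain Java, not Lucene code). It assumes that a writer buffers deletions until it is closed and that a searcher only sees the state of the index at the moment it was opened. All class and method names here (ToyDirectory, ToyWriter, ToySearcher, delete, contains) are hypothetical.

```java
import java.util.*;

// Toy model only, not Lucene code.
class ToyDirectory {
    // Ids of the documents that are currently committed.
    final Set<String> committedIds = new TreeSet<>();
}

class ToyWriter {
    private final ToyDirectory dir;
    private final Set<String> pendingDeletes = new HashSet<>();

    ToyWriter(ToyDirectory dir) { this.dir = dir; }

    // Buffers the deletion; nothing is visible to searchers yet.
    void delete(String id) { pendingDeletes.add(id); }

    // Applies the buffered deletions to the directory.
    void close() {
        dir.committedIds.removeAll(pendingDeletes);
        pendingDeletes.clear();
    }
}

class ToySearcher {
    // Copy of the committed state taken when the searcher was opened.
    private final Set<String> snapshot;

    ToySearcher(ToyDirectory dir) { this.snapshot = new TreeSet<>(dir.committedIds); }

    boolean contains(String id) { return snapshot.contains(id); }
}

class ToySnapshotDemo {
    public static void main(String[] args) {
        ToyDirectory dir = new ToyDirectory();
        dir.committedIds.add("MYKEY");

        ToySearcher oldSearcher = new ToySearcher(dir);
        ToyWriter writer = new ToyWriter(dir);
        writer.delete("MYKEY");
        writer.close();

        System.out.println(oldSearcher.contains("MYKEY"));          // prints true: stale snapshot
        System.out.println(new ToySearcher(dir).contains("MYKEY")); // prints false: reopened
    }
}
```

In this model a searcher opened before the writer is closed keeps returning the document, and only a searcher opened afterwards sees the deletion.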

Kalani Ruwanpathirana
Department of Computer Science & Engineering
University of Moratuwa
