Normally an ID should be indexed as Field.Index.UN_TOKENIZED.
Mike
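
To illustrate why an untokenized ID is the usual choice for delete-by-term,
here is a minimal sketch in plain Java. It is not Lucene code: ToyIndex,
analyze, add and deleteByTerm are hypothetical names, and the analyzer
(lowercase, split on non-alphanumerics) is an assumption. It only models the
idea that a delete by term matches terms exactly as they were indexed, so an
analyzed ID may no longer equal the key you pass in.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model, NOT the Lucene API: it only mimics exact-term matching.
class ToyIndex {
    // "field:term" -> ids of the documents that contain that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final Set<Integer> deleted = new HashSet<>();
    private int nextDocId = 0;

    // Assumed analyzer: lowercase, then split on non-alphanumeric characters.
    static List<String> analyze(String value) {
        List<String> tokens = new ArrayList<>();
        for (String t : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Adds a one-field document; tokenized=false indexes the value as one exact term.
    int add(String field, String value, boolean tokenized) {
        int docId = nextDocId++;
        List<String> terms = tokenized ? analyze(value) : List.of(value);
        for (String term : terms) {
            postings.computeIfAbsent(field + ":" + term, k -> new HashSet<>()).add(docId);
        }
        return docId;
    }

    // Marks every live document holding exactly this term as deleted; returns the count.
    int deleteByTerm(String field, String text) {
        int count = 0;
        for (int docId : postings.getOrDefault(field + ":" + text, Set.of())) {
            if (deleted.add(docId)) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("id", "MYKEY-42", false); // untokenized: the term is "MYKEY-42"
        index.add("id", "MYKEY-43", true);  // tokenized: the terms are "mykey" and "43"
        System.out.println(index.deleteByTerm("id", "MYKEY-42")); // prints 1
        System.out.println(index.deleteByTerm("id", "MYKEY-43")); // prints 0
    }
}
```

In this toy model the untokenized ID is deleted by its original key, while the
tokenized one is only reachable through the analyzed terms.
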
John Patterson wrote:
>
> That was the problem - the id was not tokenized. Thanks for your
> help.
>
>
> Kalani Ruwanpathirana wrote:
>>
>> Hi John,
>>
>> Are you sure you made the id "tokenized" while indexing? I could overcome
>> this issue by having a tokenized field, which was used for the deletion, as
>> below.
>>
>> document.add(new Field("id", id, Field.Store.YES, Field.Index.TOKENIZED));
>>
>>
>>
>> Thanks
>>
>>
>>
>> On Tue, Aug 26, 2008 at 2:15 PM, Michael McCandless <lucene@mikemccandless.com> wrote:
>>
>>>
>>>
>>> John Patterson wrote:
>>>
>>>> I just discovered some strange behaviour with deleted documents. I do a
>>>> search for documents with a certain query and delete one using
>>>> IndexWriter.deleteDocuments(Term) using a key for the term. Then I repeat
>>>> the search and the document is still there because I use a custom
>>>> HitCollector which does not check IndexReader.isDeleted(int). That is all
>>>> expected.
>>>>
>>>
>>> Hmm -- once a document is deleted, your HitCollector won't ever see it.
>>> During searching, isDeleted is called per document at a very low level.
>>>
>>> If your HitCollector is seeing it, it sounds like it wasn't really deleted.
>>> Are you sure you closed the IndexWriter and then reopened your searcher, so
>>> that the searcher will see the deletion?
>>>
>>>> But when I try to show the deleted document by searching by key using the
>>>> same term it was deleted with, it is not found. So it seems that the term
>>>> (id:MYKEY) is removed from the index.
>>>>
>>>
>>> This is odd -- the document should either be deleted (entirely), or not.
>>> You shouldn't get different behavior if you search for the doc one way vs
>>> another.
>>>
>>>> So I was surprised that the term for the id was removed but not the other
>>>> terms for the document.
>>>>
>>>
>>> That makes two of us!
>>>
>>> Mike
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>>
>> --
>> Kalani Ruwanpathirana
>> Department of Computer Science & Engineering
>> University of Moratuwa
>>
>>
>
> --
> View this message in context: http://www.nabble.com/Deleted-document-terms-tp19157027p19158657.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.