lucene-java-user mailing list archives

From Cesar Ronchese <>
Subject RE: Delete problems O.O
Date Mon, 11 Feb 2008 20:39:46 GMT

Cool man. 

That worked fine. Thanks for the detailed info. 

And hopefully your answer is going to be useful for future Google searches. ;)


Steven A Rowe wrote:
> Hi Cesar,
> On 02/11/2008 at 2:19 PM, Cesar Ronchese wrote:
>> I'm running into problems with document deletion.
>> [...]
>> This simply doesn't delete anything from the Index.
>> //see the code sample: 
>> //"theFieldName" was previously stored as Field.Store.YES and
>> Field.Index.TOKENIZED. 
>> Term t = new Terms("theFieldName", "theFieldContent");
>> objIndexReader.DeleteDocuments(t);
> (You have two typos here - "new Term/s/" and /D/eleteDocuments() - I
> assume that this is just a transcription error, since you must have gotten
> this code to run...)
> When you construct a Term instance, no analysis will be performed on
> "theFieldContent".  Since "theFieldName" is TOKENIZED, it was analyzed,
> and this is likely where the mismatch is occurring.  From
> <>:
>     This is useful if one uses a document field to
>     hold a unique ID string for the document.
> If you're trying to delete documents based on a document ID held as the
> entire value of a field, then you should be using
> Field.Index.UN_TOKENIZED.  From
>    Index the field's value without using an Analyzer,
>    so it can be searched. As no analyzer is used the
>    value will be stored as a single term. This is
>    useful for unique Ids like product numbers.
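
To make the mismatch concrete, here is a minimal sketch in plain Java. It is a toy model and not Lucene code: the class TermDeleteSketch, its methods, and the simple lowercase-and-split "analyzer" are all assumptions made up for illustration, and a real analyzer may tokenize differently. It only shows that an exact-term lookup misses a value that was split into tokens at index time, and hits one that was indexed as a single term.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy model (not Lucene) of why an exact-term delete misses a tokenized field. */
public class TermDeleteSketch {

    /** Hypothetical stand-in for an analyzer: lowercases and splits on non-alphanumerics. */
    static List<String> analyze(String value) {
        List<String> tokens = new ArrayList<String>();
        for (String t : value.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Terms that end up in the index: analyzed tokens, or the whole value as one term. */
    static Set<String> indexedTerms(String value, boolean tokenized) {
        Set<String> terms = new LinkedHashSet<String>();
        if (tokenized) {
            terms.addAll(analyze(value));
        } else {
            terms.add(value);
        }
        return terms;
    }

    /** A delete-by-term can only match if the exact term text is in the index. */
    static boolean deleteWouldMatch(String storedValue, boolean tokenized, String termText) {
        return indexedTerms(storedValue, tokenized).contains(termText);
    }

    public static void main(String[] args) {
        String id = "Doc-42 Rev A";
        // Tokenized: the index holds doc, 42, rev, a; the raw string is not a term.
        System.out.println(deleteWouldMatch(id, true, id));
        // Untokenized: the whole value is the single indexed term.
        System.out.println(deleteWouldMatch(id, false, id));
    }
}
```

In this model the first call finds nothing to delete and the second one matches, which is the behaviour described above for TOKENIZED versus UN_TOKENIZED fields.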
>> 2) DeleteDocument(numDoc) <== this problem is a woot problem
>> [...]
>> I mean, if I call objIndexReader.DeleteDocument(0), it will
>> delete the first document from the entire INDEX, not the
>> first document in the Hits collection. So, it deleted the
>> first documents I have inserted some days ago, in previous
>> indexing sessions.
> Yes, this is how this method is designed to function.  The javadoc
> description is perhaps too brief: "Deletes the document numbered
> 'docNum'".  As you have discovered, "docNum" is the one-up number assigned
> internally by Lucene to each document as it is added to the index.
>> I ask: is there a way to get the correct docNum from the
>> document retrieved in the Hits collection?
> Check out
> <>
> The "id" returned by is the same thing as the "docNum"
> parameter to IndexReader.deleteDocument(int).
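
The difference between a position in the hit list and an index-wide document number can be sketched the same way. Again this is a toy model in plain Java, not the Lucene API: DocNumSketch and its search method are hypothetical names, and the "index" is just a list whose positions play the role of docNum.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model (not Lucene): hit positions are not index-wide document numbers. */
public class DocNumSketch {

    /** Returns the index-wide numbers of all documents whose text contains the query. */
    static List<Integer> search(List<String> index, String query) {
        List<Integer> hitIds = new ArrayList<Integer>();
        for (int docNum = 0; docNum < index.size(); docNum++) {
            if (index.get(docNum).contains(query)) {
                hitIds.add(docNum);
            }
        }
        return hitIds;
    }

    public static void main(String[] args) {
        // Documents are numbered in the order they were added: 0, 1, 2.
        List<String> index = Arrays.asList("old report", "old invoice", "new invoice");
        List<Integer> hits = search(index, "new");

        int hitPosition = 0;                // first entry in the hit list
        int docNum = hits.get(hitPosition); // index-wide number of that hit

        // Deleting "document 0" would remove "old report", not the first hit.
        System.out.println("hit position " + hitPosition + " is docNum " + docNum);
    }
}
```

The point of the sketch is only that the number passed to a delete-by-number call has to be the hit's document id, not its position among the results.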
> It sounds like the documentation could benefit from some more discussion
> of the "docNum"/document "id" feature...
> Steve
