lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Cesar Ronchese <ronch...@hotmail.com>
Subject RE: Delete problems O.O
Date Mon, 11 Feb 2008 20:39:46 GMT

Cool man. 

The Hits.id(int) worked fine. Thanks for the detailed info. 

And hopefully your answer is going to usefull for future Google searches. ;)

Cesar



Steven A Rowe wrote:
> 
> Hi Cesar,
> 
> On 02/11/2008 at 2:19 PM, Cesar Ronchese wrote:
>> I'm running problems with document deletion.
>> [...]
>> This simply doesn't delete anything from the Index.
>> 
>> //see the code sample: 
>> //"theFieldName" was previously stored as Field.Store.YES and
>> Field.Index.TOKENIZED. 
>> Term t = new Terms("theFieldName", "theFieldContent");
>> objIndexReader.DeleteDocuments(t);
> 
> (You have two typos here - "new Term/s/" and /D/eleteDocuments() - I
> assume that this is just a transcription error, since you must have gotten
> this code to run...)
> 
> When you construct a Term instance, no analysis will be performed on
> "theFieldContent".  Since "theFieldName" is TOKENIZED, it was analyzed,
> and this is likely where the mismatch is occurring.  From
> <http://lucene.apache.org/java/2_3_0/api/org/apache/lucene/index/IndexReader.html#deleteDocuments(org.apache.lucene.index.Term)>:
> 
>     This is useful if one uses a document field to
>     hold a unique ID string for the document.
> 
> If you're trying to delete documents based on a document ID held as the
> entire value of a field, then you should be using
> Field.Index.UN_TOKENIZED.  From
> http://lucene.apache.org/java/2_3_0/api/org/apache/lucene/document/Field.Index.html#UN_TOKENIZED>:
> 
>    Index the field's value without using an Analyzer,
>    so it can be searched. As no analyzer is used the
>    value will be stored as a single term. This is
>    useful for unique Ids like product numbers.
> 
>> 2) DeleteDocument(numDoc) <== this problem is a woot problem
>> 
>> [...]
>> 
>> I mean, if I call objIndexReader.DeleteDocument(0), it will
>> delete the first document from the entire INDEX, not the
>> first document in the Hits collection. So, it deleted the
>> first documents I have inserted some days ago, in previous
>> indexing sessions.
> 
> Yes, this is how this method is designed to function.  The javadoc
> description is perhaps too brief: "Deletes the document numbered
> 'docNum'".  As you have discovered, "docNum" is the one-up number assigned
> internally by Lucene to each document as it is added to the index.
>  
>> I ask: is there a way to get the correct docNum from the
>> document retrieved in the Hits collection?
> 
> Check out Hits.id(int):
> <http://lucene.apache.org/java/2_3_0/api/org/apache/lucene/search/Hits.html#id(int)>
> 
> The "id" returned by Hits.id(int) is the same thing as the "docNum"
> parameter to IndexReader.deleteDocument(int).
> 
> It sounds like the documentation could benefit from some more discussion
> of the "docNum"/document "id" feature...
> 
> Steve
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Delete-problems----O.O-tp15418663p15420001.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message