lucene-java-user mailing list archives

From John Haxby <...@scalix.com>
Subject Re: Duplicate Hits
Date Tue, 01 Feb 2005 15:05:50 GMT
Jerry Jalenak wrote:

>Given Erik's response of 'don't put duplicate documents in the index', how
>can I accomplish this in the IndexWriter?
>
I was dealing with a similar requirement recently. I eventually
decided on storing the MD5 checksum of each document as a keyword
field, so that a document whose checksum is already in the index can
be skipped. It means reading each document twice (once to calculate
the checksum, once to index it), but it seems to do the trick.
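
A minimal sketch of the checksum idea in plain Java, for illustration
only and not the code I actually run. The class DuplicateFilter and
its methods are hypothetical names, and an in-memory Set stands in for
the index so the example runs without Lucene on the classpath.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: decides whether a document has been seen before. */
class DuplicateFilter {
    // Assumption: this set stands in for the keyword field in the index.
    private final Set<String> seenChecksums = new HashSet<>();

    /** First pass over the document: stream it through MD5, return lowercase hex. */
    static String md5Hex(InputStream in) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Returns true if the checksum is new, meaning the caller should now
     * read the document a second time and index it. Returns false for a
     * duplicate, which the caller should skip.
     */
    boolean shouldIndex(InputStream in) {
        return seenChecksums.add(md5Hex(in));
    }
}
```

Against a real index the Set lookup would presumably become a term
lookup on the checksum field before each addDocument call, for example
checking that IndexReader.docFreq(new Term("md5", checksum)) is zero
(the field name "md5" is an assumption). Note that a checksum only
catches byte-for-byte identical documents.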

jch


