lucene-java-user mailing list archives

From John Haxby
Subject Re: Duplicate Hits
Date Tue, 01 Feb 2005 15:05:50 GMT
Jerry Jalenak wrote:

>Given Erik's response of 'don't put duplicate documents in the index', how
>can I accomplish this in the IndexWriter?
I was dealing with a similar requirement recently. I eventually
decided on storing the MD5 checksum of each document as a keyword
field. It means reading the document twice (once to calculate the
checksum, once to index it), but it seems to do the trick.
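
A minimal sketch of the checksum step, assuming the document content is
available as a byte array. The class name Md5Dedup and its methods are
hypothetical, and the in-memory set only stands in for a lookup against
the checksum keyword stored in the index; no Lucene calls are shown.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: not part of Lucene, for illustration only.
class Md5Dedup {
    // Stand-in for the index: checksums of documents already accepted.
    private final Set<String> seen = new HashSet<>();

    // Returns the MD5 checksum of the content as a lowercase hex string.
    static String md5Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(content);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Returns true the first time a given content is seen, false for a duplicate.
    boolean shouldIndex(byte[] content) {
        return seen.add(md5Hex(content));
    }
}
```

In a real setup the checksum would be stored with each document as an
untokenized keyword, and the duplicate check would be a search on that
keyword before the document is added.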

