lucene-java-user mailing list archives

From James Dunn <james_h_d...@yahoo.com>
Subject Re: Preventing duplicate document insertion during optimize
Date Sat, 01 May 2004 02:03:40 GMT
Kevin,

I have a similar issue.  The only solution I have been
able to come up with is, after the merge, to open an
IndexReader against the merged index, iterate over all
the docs, and delete the duplicates based on my
"primary key" field.

Jim
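
The cleanup pass described above can be sketched roughly as follows. This is
only an illustration: MergedIndexSketch is a hypothetical in-memory stand-in
for the merged index, not a Lucene class, and its method names (maxDoc,
isDeleted, key, delete) are assumptions picked to resemble the kind of calls
an IndexReader offers. It keeps the first document seen for each primary-key
value and deletes every later one.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory stand-in for a merged index; NOT a Lucene class.
class MergedIndexSketch {
    private final List<String> keys = new ArrayList<>();      // primary-key value per doc id
    private final List<Boolean> deleted = new ArrayList<>();  // deletion flag per doc id

    void add(String primaryKey) {
        keys.add(primaryKey);
        deleted.add(false);
    }

    int maxDoc() { return keys.size(); }

    boolean isDeleted(int docId) { return deleted.get(docId); }

    String key(int docId) { return keys.get(docId); }

    void delete(int docId) { deleted.set(docId, true); }

    // Walk every doc id once. The first document seen for a key is kept;
    // any later document carrying the same key is deleted.
    // Returns the number of documents deleted.
    static int removeDuplicates(MergedIndexSketch index) {
        Set<String> seen = new HashSet<>();
        int removed = 0;
        for (int docId = 0; docId < index.maxDoc(); docId++) {
            if (index.isDeleted(docId)) {
                continue; // already deleted, nothing to do
            }
            String key = index.key(docId);
            if (key == null) {
                continue; // no primary key, so it cannot be judged a duplicate
            }
            if (!seen.add(key)) { // add() returns false when the key was already present
                index.delete(docId);
                removed++;
            }
        }
        return removed;
    }
}
```

Note that the set holds every distinct key in memory, so this approach assumes
the number of unique primary keys fits comfortably in the heap.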

--- "Kevin A. Burton" <burton@newsmonster.org> wrote:
> Let's say you have two indexes each with the same
> document literal.  All 
> the fields hash the same and the document is a
> binary duplicate of a 
> different document in the second index.
> 
> What happens when you do a merge to create a 3rd
> index from the first 
> two?  I assume you now have two documents that are
> identical in one 
> index.  Is there any way to prevent this?
> 
> It would be nice to figure out if there's a way to
> flag a field as a 
> primary key so that a document whose key has already
> been added is just skipped.
> 
> Kevin
