Date: Sat, 17 Sep 2005 08:44:31 -0700
From: Jeff Rodenburg
Reply-To: jeff.rodenburg@gmail.com
To: java-user@lucene.apache.org, ben.d.gill@gmail.com
Subject: Re: Stopping Duplicates

Ben -

I can think of two ways to achieve this.

1) While adding your information to the index, query the index for an
existing record. If you get no match, add the record.
2) Control the exclusivity requirement from your data source, so that no
duplicate records ever have the opportunity to be indexed.

This is an operational question, so the *best* way depends on your overall
operation, as both of these approaches have consequences on index
maintenance operations.

Hope this helps.

-- jeff

On 9/17/05, Ben Gill wrote:
>
> Hi,
>
> I am storing names in my index, and am currently getting duplicates
> back (quite correctly, on Lucene's part), because I am storing:
>
> id  name
> 1   fred
> 2   fred
>
> What I want to happen is, if a duplicate name is added to the index, I
> only ever want one entity to exist with the name....
>
> What is the best way for me to achieve this?
>
> Thanks
>
> Ben
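
The two approaches in the reply can be sketched roughly as follows. This is a minimal illustration only, and it deliberately does not use the Lucene API: `NameIndex`, `addIfAbsent`, `DedupSketch` and `distinctNames` are hypothetical names, and a plain in-memory map stands in for the index. In a real Lucene setup the lookup step would be a search on the name field before calling `IndexWriter.addDocument`, and whether two names count as "the same" would depend on how the field is analyzed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the index; NOT the Lucene API.
class NameIndex {
    // name -> id of the first record indexed under that name
    private final Map<String, Integer> idByName = new LinkedHashMap<>();

    // Approach 1: look the name up first and add only if nothing matches.
    // Returns true if the record was added, false if it was a duplicate.
    boolean addIfAbsent(int id, String name) {
        if (idByName.containsKey(name)) {   // the "query the index" step
            return false;
        }
        idByName.put(name, id);             // the "add the record" step
        return true;
    }

    int size() {
        return idByName.size();
    }

    // Returns the id stored for the name, or null if the name is absent.
    Integer idFor(String name) {
        return idByName.get(name);
    }
}

class DedupSketch {
    // Approach 2: drop duplicates at the data source, before indexing.
    // Keeps the first occurrence of each name, in the original order.
    static List<String> distinctNames(List<String> names) {
        return new ArrayList<>(new LinkedHashSet<>(names));
    }

    public static void main(String[] args) {
        NameIndex index = new NameIndex();
        System.out.println(index.addIfAbsent(1, "fred")); // true
        System.out.println(index.addIfAbsent(2, "fred")); // false
        System.out.println(index.size());                 // 1
        System.out.println(distinctNames(Arrays.asList("fred", "fred", "wilma"))); // [fred, wilma]
    }
}
```

With the data from the original question, the second "fred" (id 2) is rejected and only id 1 remains. Note that the check-then-add in approach 1 is not atomic; with several writers it would need some form of external locking, which is one of the maintenance consequences mentioned above.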