From: Chris Hostetter
To: solr-user@lucene.apache.org
Date: Thu, 15 Jan 2009 15:32:05 -0800 (PST)
Subject: Re: delta index produces multiple results?
In-Reply-To: <861D96B1-733B-4A9A-B9FF-18D4DB78A299@fotocommunity.net>

: Full index is working fine, in schema.xml I implemented a uniqueKey field
: (which is of the type 'text').

using "text" as the fieldtype for a uniqueKey is almost never a good idea. it could easily explain the behavior you are seeing.

DataImportHandler (and all of the update handlers) relies on the underlying UpdateProcessor to delete docs with identical uniqueKeys when you "update" an existing document ... if the uniqueKey field has an analyzer that produces multiple tokens (TextField frequently does) then the behavior becomes undefined.

stick with something like StrField or IntField for your uniqueKey field ... or if you must use TextField, make sure you are using the KeywordTokenizer.

if changing this still causes problems, then we'll need to see your schema.xml, your data-config.xml, and the output of a search where you get some duplications like this to help figure out what else might be going wrong.

-Hoss
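
for reference, a minimal sketch of what the relevant parts of schema.xml might look like with a single-token uniqueKey. the field name "id" and the type names "string" and "text_keyword" are assumptions for illustration only, not taken from the original poster's schema; adapt them to your own field names.

```xml
<schema name="example" version="1.1">
  <types>
    <!-- StrField is not analyzed: the whole value is indexed as one term -->
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>

    <!-- only if TextField is really required: KeywordTokenizer emits the
         entire input as a single token ("text_keyword" is a hypothetical name) -->
    <fieldType name="text_keyword" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.KeywordTokenizerFactory"/>
      </analyzer>
    </fieldType>
  </types>

  <fields>
    <!-- "id" is an assumed field name; use whatever your uniqueKey field is called -->
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
  </fields>

  <uniqueKey>id</uniqueKey>
</schema>
```

changing the type of an existing field generally means the index has to be rebuilt (a full import) before the new behavior shows up, since documents already indexed keep the terms produced by the old analyzer.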