From: Nick Johnson
To: java-user@lucene.apache.org
Date: Wed, 4 Jul 2007 09:01:29 -0700 (PDT)
Subject: Re: problems with deleteDocuments
Message-ID: <20070704085616.E82369@turing>
In-Reply-To: <359a92830707040730y31ef1934u93784df70e9c40de@mail.gmail.com>
References: <20070704004859.J82369@turing> <359a92830707040730y31ef1934u93784df70e9c40de@mail.gmail.com>

I think I follow you. I don't have a problem with storing something like
a primary key as UN_TOKENIZED, though I'm a bit baffled about why it
didn't work as TOKENIZED, since the _only_ thing in that field is the
value of the primary key (i.e., the string value of some integer). It
seems like it should have matched exactly either way... unless perhaps
the StopAnalyzer is tokenizing the primary key strangely.

What still confounds me is the second problem, where adding a new
document that has identical fields to a deleted document fails to store
the new document.

On Wed, 4 Jul 2007, Erick Erickson wrote:

> This is exactly the behavior I'd expect.
>
> Consider what would happen otherwise. Say you have documents
> with the following values for a field (call it blah).
>
>   some data
>   some data I put in the index
>   lots of data
>   data
>
> Then I don't want deleting on the term blah:data to remove all
> of them. Which seems to be what you're asking. Even if
> you restricted things to "phrases", then deleting on the term
> 'blah:some data' would remove two documents.
>
> So, while UN_TOKENIZED isn't a *requirement*, exact total term
> matches *is* the requirement. By that, I meant that whatever
> goes into the field must not be broken into pieces by the indexing
> tokenizer for deletes to work as you expect.
>
> Best
> Erick

-- 
"Courage isn't just a matter of not being frightened, you know. It's
being afraid and doing what you have to do anyway."
  Doctor Who - Planet of the Daleks
This message has been brought to you by Nick Johnson 2.3b1 and the
number 6.
http://healerNick.com/ http://morons.org/ http://spatula.net/
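
The tokenization question in this message can be sketched without Lucene at all. What follows is a plain-Java toy, not Lucene code, and every class and method name in it is hypothetical. letterTokens imitates a letter-only, lowercasing tokenizer, which as far as I recall is what StopAnalyzer builds on in Lucene 2.x (an assumption worth checking against the javadoc). wholeValue imitates an UN_TOKENIZED field. Stop-word removal is left out to keep it short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class TokenDeleteSketch {

    // Toy stand-in for a letter-only, lowercasing tokenizer (assumed behavior,
    // not Lucene's implementation): a token is a maximal run of letters, and
    // digits or punctuation only act as separators.
    public static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Toy stand-in for an UN_TOKENIZED field: the whole value is one term.
    public static List<String> wholeValue(String text) {
        return Collections.singletonList(text);
    }

    // Counts the documents whose indexed terms contain the given term exactly,
    // i.e. the documents a delete-by-term would hit.
    public static int countMatches(List<List<String>> docs, String term) {
        int hits = 0;
        for (List<String> terms : docs) {
            if (terms.contains(term)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // The four field values from Erick's example.
        List<String> values = Arrays.asList(
                "some data", "some data I put in the index", "lots of data", "data");
        List<List<String>> tokenized = new ArrayList<>();
        List<List<String>> untokenized = new ArrayList<>();
        for (String v : values) {
            tokenized.add(letterTokens(v));
            untokenized.add(wholeValue(v));
        }
        System.out.println(countMatches(tokenized, "data"));   // prints 4
        System.out.println(countMatches(untokenized, "data")); // prints 1
        System.out.println(letterTokens("12345"));             // prints []
        System.out.println(wholeValue("12345"));               // prints [12345]
    }
}
```

If the assumption about the tokenizer holds, a purely numeric primary key produces no terms at all in a TOKENIZED field, so a delete on that term has nothing to match. Erick's four example values, by contrast, all share the term "data" once tokenized, which is why deleting on it would hit every one of them.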