From: Erik Hatcher
To: "Lucene Users List"
Subject: Re: word thrown out in exact phrase with setPhraseSlop(0)
Date: Tue, 15 Jun 2004 12:20:10 -0400

On Jun 15, 2004, at 11:58 AM, Claude Devarenne wrote:

> In my application, when a user searches on an exact phrase like "key
> to project" the word "to" gets thrown out.  I currently use the query
> parser and enclose the user query string in quotes.  I did
> setPhraseSlop(0) but that did not help.  Would this be resolved by
> building a phrase query?

It only gets thrown out because you're using an analyzer that removes
stop words.  If you used that same analyzer for indexing, then matches
should be made just fine as the same phrase would still have had "to"
thrown out.  Phrase slop has nothing to do with the analysis of phrases.

> I looked at the JUnit tests for phrase query and it looks like I have
> to parse the incoming query string, add the terms to the phrase query
> and field to search on for each term.  Is that correct?  How do I
> handle complex queries where an exact phrase may be combined with
> several boolean connectors as in:
>
> "relation to trends" AND (pulmonary disease OR emphysema)
>
> This currently results in the following Query:
> +all:"relation trends" +(+all:pulmonary all:disease all:emphysema)

What is the issue you're having with the stop word removal?  More
matches than it should return?  If you remove stop words you lose
precision, no question.  What you are showing is as expected given stop
word removal during analysis.

	Erik
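
To make the point about analysis concrete, here is a minimal sketch in
plain Java. It is not Lucene code: the class name, the stop list, and
both helper methods are hypothetical, and it assumes that removing a
stop word leaves no positional gap between the surrounding terms. It
only illustrates why the phrase still matches when the same stop-word
removal is applied at index time and at query time, and where the loss
of precision comes from.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopWordPhraseDemo {
    // Hypothetical stop list for illustration; not Lucene's actual default list.
    static final Set<String> STOP_WORDS = Set.of("to", "of", "the", "a", "an");

    // Toy analyzer: lowercase, split on whitespace, drop stop words.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Exact phrase match (the slop 0 case): the analyzed phrase terms must
    // appear consecutively in the analyzed document. Both sides go through
    // the same analyzer, as they would at index time and query time.
    static boolean matchesPhrase(String document, String phrase) {
        List<String> phraseTerms = analyze(phrase);
        return !phraseTerms.isEmpty()
                && Collections.indexOfSubList(analyze(document), phraseTerms) >= 0;
    }

    public static void main(String[] args) {
        // "to" is dropped from the query phrase.
        System.out.println(analyze("key to project"));                  // [key, project]

        // Still matches, because "to" was dropped from the document as well.
        System.out.println(matchesPhrase("the key to project success",
                                         "key to project"));            // true

        // Lost precision: a different stop word in the document also matches.
        System.out.println(matchesPhrase("a key of project planning",
                                         "key to project"));            // true

        // "for" is not in this toy stop list, so the terms are not adjacent.
        System.out.println(matchesPhrase("key for project",
                                         "key to project"));            // false
    }
}
```

Nothing in the sketch depends on slop: the word "to" is gone before any
phrase matching happens, which is why setPhraseSlop(0) makes no
difference to it.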