From: Ian Lea
Date: Tue, 26 Jul 2011 11:56:57 +0100
Subject: Re: Strange StopFilter and stop words behaviour
To: java-user@lucene.apache.org

I think that passing an empty set or null to StandardAnalyzer should do
what you want. There are useful tips at
http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2BAC8_incorrect_hits.3F.
My guess would be that you aren't using a no-stop-words version of
StandardAnalyzer at both index and query time.

--
Ian.

On Tue, Jul 26, 2011 at 4:25 AM, SBS wrote:
> My goal is to be able to get meaningful results from search queries that
> include some words that are on the default stop words list, especially
> "not". I am using the StandardAnalyzer and I have tried passing in null and
> an empty set for the set of stop words to use in the constructor hoping that
> no words would be stripped but I am getting strange results.
>
> If I enter a query of just the word "not" I get no matches. If I run a
> query with just the word "included" I get lots of matches. If I run the
> query "not included" (without surrounding quotation marks) I get lots of
> matches and the highlighter indicates that "not" is one of the matching
> fragments. But if I run the query ""not included"" (with surrounding
> quotation marks) I get no matches even though there are many occurrences in
> the content of that exact phrase which were matched when I entered the same
> query without the quotation marks.
>
> What's going on here? Why can't I search for the word "not" by itself or in
> a quote?
> Similar behaviour happens for other words like "the" but I am
> explicitly telling the analyzer not to remove any words (or so I believe).
> How can I achieve a StandardAnalyzer where every word in the query is
> significant?
>
> Thanks,
>
> -sbs
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Strange-StopFilter-and-stop-words-behaviour-tp3199367p3199367.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
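
To make the guess in the reply concrete, here is a rough sketch of how the reported symptoms could arise if the index was built with stop words removed while queries are analyzed with none removed. It is plain Java and a toy model only: it uses no Lucene classes, and every name in it (StopWordMismatchSketch, analyze, index, matchesAny, matchesPhrase, the sample text and the stop list) is made up for illustration. It also ignores details such as position gaps left by removed words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class StopWordMismatchSketch {

    /** Toy analyzer: lowercase, split on non-letters, drop stop words. */
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty() && !stopWords.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Toy index for a single document: term -> positions of that term. */
    static Map<String, List<Integer>> index(String text, Set<String> stopWords) {
        Map<String, List<Integer>> postings = new HashMap<>();
        List<String> tokens = analyze(text, stopWords);
        for (int pos = 0; pos < tokens.size(); pos++) {
            postings.computeIfAbsent(tokens.get(pos), k -> new ArrayList<>()).add(pos);
        }
        return postings;
    }

    /** Unquoted query: a hit if any query term is in the index. */
    static boolean matchesAny(Map<String, List<Integer>> postings, List<String> query) {
        for (String term : query) {
            if (postings.containsKey(term)) {
                return true;
            }
        }
        return false;
    }

    /** Quoted query: a hit only if all terms occur at consecutive positions. */
    static boolean matchesPhrase(Map<String, List<Integer>> postings, List<String> query) {
        if (query.isEmpty()) {
            return false;
        }
        List<Integer> starts = postings.getOrDefault(query.get(0), Collections.emptyList());
        for (int start : starts) {
            boolean ok = true;
            for (int i = 1; i < query.size() && ok; i++) {
                ok = postings.getOrDefault(query.get(i), Collections.emptyList())
                        .contains(start + i);
            }
            if (ok) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String doc = "Batteries are not included in the box.";
        // Hypothetical stop list standing in for an analyzer's defaults.
        Set<String> stops = new HashSet<>(Arrays.asList("are", "not", "in", "the"));
        Set<String> none = Collections.emptySet();

        // Mismatch: indexed WITH stop words removed, queried with none removed.
        Map<String, List<Integer>> mismatched = index(doc, stops);
        System.out.println("not            : " + matchesAny(mismatched, analyze("not", none)));
        System.out.println("included       : " + matchesAny(mismatched, analyze("included", none)));
        System.out.println("not included   : " + matchesAny(mismatched, analyze("not included", none)));
        System.out.println("\"not included\" : " + matchesPhrase(mismatched, analyze("not included", none)));

        // Consistent: the same empty stop set at index time and at query time.
        Map<String, List<Integer>> consistent = index(doc, none);
        System.out.println("\"not included\" : " + matchesPhrase(consistent, analyze("not included", none)));
    }
}
```

In this model the unquoted query still gets hits because "included" alone is enough, while "not" on its own and the quoted phrase both need a term that was never indexed. If that is what is happening with the real index, the content would need to be re-indexed with the no-stop-words analyzer before the queries can match.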