Message-ID: <4DA86D72.1000205@sdsc.edu>
Date: Fri, 15 Apr 2011 09:08:18 -0700
From: Christopher Condit
To: "java-user@lucene.apache.org"
CC: Chris Mantle
Subject: Re: Can't perform exact match...?

On 4/11/2011 1:47 AM, Chris Mantle wrote:
> Hi, I’m having some trouble with Lucene at the moment. I have a number of unique identifiers that I need to search through. They’re in many different forms, eg. “M”, “MO”, “:MOFB”, “FH..L-O”, etc. All I need to do is an exact prefix search: at the moment, if I type in ‘M’, I get “M”, “MO” and “:MOFB”, and I’d like to avoid getting “:MOFB” until the user actually types in ‘:M’.
>
> This is with a StandardAnalyzer and a PrefixQuery. I’ve tried many different combinations of analyzer and query. If I use a WhitespaceAnalyzer or a KeywordAnalyzer, I see that tokens are generated in a form that I’d expect (“:MOFB” instead of “mofb”, for instance), but I can’t search with a wildcard: searching with ‘M*’ returns nothing; ‘M’ returns “M” alone. I’ve also tried using ANALYSED and NOT_ANALYSED indexing to no avail.
>
> Can anyone advise me on how to remedy this? There must be something I’m missing here...

I'm not sure why the WhitespaceAnalyzer isn't working for you (perhaps someone else knows), but here's one that does: http://www.pastie.org/1797909

I just wrapped a WhitespaceTokenizer in a LowerCaseFilter and it has the desired effect.

Good luck,
-Chris
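
For readers who cannot open the pastie link, here is a rough illustration of why whitespace tokenization followed by lowercasing gives the behaviour asked for. It is not the code from the link and it does not use Lucene at all: it is a plain-Java model, and the class and method names (PrefixMatchSketch, analyze, prefixSearch) are made up for this sketch. Splitting on whitespace only keeps the leading ':' inside the token, so ":MOFB" is indexed as ":mofb" rather than "mofb", and a prefix of 'M' no longer reaches it. One possible explanation for the WhitespaceAnalyzer result above (an assumption, nothing in this thread confirms it): if the query was built with QueryParser, prefix and wildcard terms are lowercased by default, so 'M*' would be searched as 'm*' and miss the upper-case tokens in the index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Plain-Java model of "whitespace tokenizer + lowercase filter" followed by a
// prefix match. Hypothetical names; no Lucene classes are involved.
public class PrefixMatchSketch {

    // Split on whitespace only, so punctuation such as ':' stays in the token,
    // then lowercase each token.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Lowercase the user's input the same way and keep every identifier that
    // has at least one token starting with it.
    static List<String> prefixSearch(List<String> identifiers, String input) {
        String prefix = input.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String id : identifiers) {
            for (String token : analyze(id)) {
                if (token.startsWith(prefix)) {
                    hits.add(id);
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("M", "MO", ":MOFB", "FH..L-O");
        System.out.println(prefixSearch(ids, "M"));   // [M, MO]
        System.out.println(prefixSearch(ids, ":M"));  // [:MOFB]
    }
}
```

The point of the model is that the index side and the query side are normalized identically. In Lucene itself that would mean passing the user's input through the same lowercasing before building the PrefixQuery.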