From: Michel Nadeau
Date: Mon, 14 Dec 2009 16:36:59 -0500
Message-ID: <962c93f60912141336q173c3626l175892d82dedae85@mail.gmail.com>
Subject: Lower/Uppercase problem when searching in a not-analyzed field
To: java-user@lucene.apache.org

Hi!

My Lucene 3.0.0 index contains a field "DOMAIN" that holds an Internet domain name, like:

* www.DomainName.com
* www.domainname.com
* www.DomainName.com/path/to/document/doc.html?a=2

This field is indexed like this:

    doc.add(new Field("DOMAIN", sValue, Field.Store.YES, Field.Index.NOT_ANALYZED));

When I search in this field, my search query looks like this:

    DOMAIN:www.DomainName*

My problem is that the search never seems to return domains that contain uppercase letters. For example, when I display all documents (using ConstantScoreQuery), I see this domain name:

    www.BidClerk.com

So I know it's there. But when I search for:

    DOMAIN:www.BidC*

it is *never* found! Any all-lowercase domain, on the other hand, is found every time.

My guess is that the problem is the analyzer I'm using, a StandardAnalyzer:

    QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "content", new StandardAnalyzer(Version.LUCENE_CURRENT));
    q = parser.parse(QUERY);

So here are my questions:

* Should I use a KeywordAnalyzer instead?
* If I have domains like WWW.ASK.COM, www.ask.com, www.Ask.com, WwW.AsK.CoM and I search for "DOMAIN: www.ask.com", will they all be found regardless of case?

Thanks!

- Mike
akaris@gmail.com
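
The symptom described above can be reproduced without Lucene. The following is a minimal sketch in plain Java, not Lucene code: it assumes that a not-analyzed field keeps each value verbatim as one exact term, and that the prefix from the query gets lowercased somewhere before matching (an assumption, not something this message confirms). The class DomainPrefixSketch and its helpers prefixMatch and normalize are hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class DomainPrefixSketch {

    // Case-sensitive prefix match over verbatim terms. This mimics an
    // un-analyzed field, where each stored value is one exact term.
    static List<String> prefixMatch(List<String> terms, String prefix) {
        List<String> hits = new ArrayList<String>();
        for (String term : terms) {
            if (term.startsWith(prefix)) {
                hits.add(term);
            }
        }
        return hits;
    }

    // Hypothetical helper: lowercase a value the same way at index time
    // and at query time, so case differences cannot cause a miss.
    static String normalize(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        List<String> stored = Arrays.asList(
                "www.BidClerk.com", "www.bidclerk.com", "WWW.ASK.COM");

        // Assumed failure mode: the prefix is lowercased, the terms are not,
        // so the mixed-case term "www.BidClerk.com" is missed.
        System.out.println(prefixMatch(stored, normalize("www.BidC")));

        // Possible remedy: lowercase the values before they are indexed too.
        List<String> lowered = new ArrayList<String>();
        for (String value : stored) {
            lowered.add(normalize(value));
        }
        System.out.println(prefixMatch(lowered, normalize("www.BidC")));
    }
}
```

If the values were lowercased before indexing in this way, a lowercased query such as www.ask.com would match every case variant, at the cost of losing the original casing in the searchable term. The original casing could still be kept in a separate stored field.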