From: Paul Taylor
Date: Mon, 25 Mar 2013 12:50:20 +0000
To: java-user@lucene.apache.org
Subject: Using MappingCharFilter in analyzer breaking wildcard matches

I created this simple StripSpacesAndSeparatorsAnalyzer so that it ignores certain characters, such as hyphens, in the field. That way I can search for

    catno:WRATHCD25
    catno:WRATHCD-25

and get the same results, and that works (the original value of the field added to the index was WRATHCD-25).

However, there is a problem with wildcard searching:

    catno:WRATHCD25*    works
    catno:WRATHCD-25*   does not

If I amend the analyzer to comment out the initReader() method, then catno:WRATHCD-25* works, but of course catno:WRATHCD25 no longer works.
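
What am I doing wrong, please?

    import java.io.Reader;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.Tokenizer;
    import org.apache.lucene.analysis.charfilter.MappingCharFilter;
    import org.apache.lucene.analysis.charfilter.NormalizeCharMap;
    import org.apache.lucene.analysis.core.KeywordTokenizer;
    import org.apache.lucene.analysis.core.LowerCaseFilter;

    public class StripSpacesAndSeparatorsAnalyzer extends Analyzer {

        protected NormalizeCharMap charConvertMap;

        // Map each separator character to the empty string so it is dropped.
        protected void setCharConvertMap() {
            NormalizeCharMap.Builder builder = new NormalizeCharMap.Builder();
            builder.add(" ", "");
            builder.add("-", "");
            builder.add("_", "");
            builder.add(":", "");
            charConvertMap = builder.build();
        }

        public StripSpacesAndSeparatorsAnalyzer() {
            setCharConvertMap();
        }

        // Treat the whole field value as a single token, then lowercase it.
        @Override
        protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            Tokenizer source = new KeywordTokenizer(reader);
            // Depending on the Lucene version, this constructor may also
            // need a Version argument.
            TokenStream filter = new LowerCaseFilter(source);
            return new TokenStreamComponents(source, filter);
        }

        // Strip the separator characters before the tokenizer sees the text.
        @Override
        protected Reader initReader(String fieldName, Reader reader) {
            return new MappingCharFilter(charConvertMap, reader);
        }
    }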
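
To make the symptom concrete, here is a minimal plain-Java sketch (no Lucene involved) of what the analyzer does to a value, and of how a prefix match behaves depending on whether the wildcard term has had the same characters stripped. It rests on the assumption that the wildcard term reaches the index without passing through initReader(); that is only a guess at the cause, not something confirmed here. The names NormalizeSketch, normalize and prefixMatches are made up for illustration.

```java
import java.util.Locale;

public class NormalizeSketch {

    // Hypothetical stand-in for the analyzer: drop ' ', '-', '_' and ':'
    // and lowercase whatever is left.
    static String normalize(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            if (c == ' ' || c == '-' || c == '_' || c == ':') {
                continue;
            }
            sb.append(c);
        }
        return sb.toString().toLowerCase(Locale.ROOT);
    }

    // Hypothetical stand-in for a prefix query: the indexed term must
    // start with the prefix exactly as given.
    static boolean prefixMatches(String indexedTerm, String prefix) {
        return indexedTerm.startsWith(prefix);
    }

    public static void main(String[] args) {
        // The value as it would be stored after analysis.
        String indexed = normalize("WRATHCD-25");

        // Prefix that already has no hyphen: matches.
        System.out.println(prefixMatches(indexed, "wrathcd25"));

        // Prefix that kept its hyphen (assumed not to be char-filtered): no match.
        System.out.println(prefixMatches(indexed, "wrathcd-25"));

        // Same prefix, normalized the same way as the indexed value: matches.
        System.out.println(prefixMatches(indexed, normalize("WRATHCD-25")));
    }
}
```

If that guess is right, stripping the same characters from the wildcard term before it is handed to the query parser would make both forms of the query match.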