Subject: Re: Phonetic search with Lucene 3.2
From: Paul Libbrecht
Date: Wed, 9 Nov 2011 14:20:58 +0100
To: java-user@lucene.apache.org
Reply-To: java-user@lucene.apache.org

That uses Lucene 2.9.2 indeed.

paul


On 9 Nov 2011 at 11:43, Felipe Carvalho wrote:

> Which version of Lucene are you using? I had tried it with Lucene 3.3 and
> had some problems, did you have to do any customizations?
>
> On Wed, Nov 9, 2011 at 8:38 AM, Paul Libbrecht wrote:
>
>> We've been using
>> http://www.tangentum.biz/en/products/phonetix/
>> which does double-metaphone.
>> Maybe that helps.
>>
>> paul
>>
>>
>> On 9 Nov 2011 at 11:29, Felipe Carvalho wrote:
>>
>>> Using PerFieldAnalyzerWrapper seems to be working for what I need!
>>>
>>> On indexing:
>>>
>>>     // Phonetic analysis for the "name" field only; other fields use
>>>     // StandardAnalyzer.
>>>     PerFieldAnalyzerWrapper wrapper = new PerFieldAnalyzerWrapper(
>>>         new StandardAnalyzer(Version.LUCENE_33));
>>>     wrapper.addAnalyzer("name", new MetaphoneReplacementAnalyzer());
>>>     IndexWriterConfig indexWriterConfig =
>>>         new IndexWriterConfig(Version.LUCENE_33, wrapper);
>>>     Directory directory = FSDirectory.open(new File(indexPath));
>>>     IndexWriter indexWriter = new IndexWriter(directory, indexWriterConfig);
>>>
>>> On search:
>>>
>>>     Directory directory = FSDirectory.open(
>>>         new File(lastIndexDir(Calendar.getInstance())));
>>>     IndexSearcher is = new IndexSearcher(directory);
>>>     // Same per-field analyzer setup as at indexing time.
>>>     PerFieldAnalyzerWrapper wrapper = new PerFieldAnalyzerWrapper(
>>>         new StandardAnalyzer(Version.LUCENE_33));
>>>     wrapper.addAnalyzer("name", new MetaphoneReplacementAnalyzer());
>>>     QueryParser parser = new QueryParser(Version.LUCENE_33, "name", wrapper);
>>>     Query query = parser.parse(expression);
>>>     ScoreDoc[] hits = is.search(query, 1000).scoreDocs;
>>>
>>> Does anyone know of any other phonetic analyzer implementations? I'm using
>>> MetaphoneReplacementAnalyzer from the LIA examples.
>>>
>>> I'm looking at the lucene-contrib modules at
>>> http://lucene.apache.org/java/3_4_0/lucene-contrib/index.html but I can't
>>> seem to find other phonetic analyzers.
>>>
>>> Thanks!
>>>
>>>
>>> On Tue, Nov 8, 2011 at 12:19 PM, Erik Hatcher wrote:
>>>
>>>>
>>>> On Nov 8, 2011, at 05:42 , Felipe Carvalho wrote:
>>>>>> Yes, quite possible, including boosting on exact matches if you want.
>>>>>> Use a BooleanQuery to wrap clauses parsed once with phonetic analysis,
>>>>>> and once without, including fields at indexing time for both too of
>>>>>> course.
>>>>>>
>>>>>
>>>>> Would it be possible to point to an example where this is done?
>>>>> The best example of a BooleanQuery I've found so far is this one:
>>>>>
>>>>> http://www.avajava.com/tutorials/lessons/how-do-i-combine-queries-with-a-boolean-query.html
>>>>>
>>>>> But I couldn't find a boolean query using different analyzers for
>>>>> different fields of the attribute.
>>>>
>>>> You could use two different QueryParser instances with different
>>>> analyzers. Or use the PerFieldAnalyzerWrapper, though you'll still need
>>>> two instances in order to have a different default field for each
>>>> expression. But then use the techniques you saw in that article (or in
>>>> Lucene in Action, since you mentioned having that) to combine Query
>>>> objects into a BooleanQuery.
>>>>
>>>> Erik
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
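
The idea Erik describes in the thread (one clause matched phonetically, one matched exactly, with exact matches boosted) can be illustrated without Lucene. What follows is only a rough sketch under stated assumptions, not Lucene code and not Lucene's scoring. It uses a plain Soundex key as a stand-in for the Metaphone analyzer discussed above, and the class name PhoneticMatchSketch, the EXACT_BOOST value and the score method are all hypothetical names made up for this illustration. In Lucene itself the two clauses would presumably be two Query objects added to a BooleanQuery as optional (SHOULD) clauses, with a boost set on the exact one.

```java
import java.util.Locale;

// Hypothetical illustration class; nothing here is part of Lucene.
class PhoneticMatchSketch {

    // Assumed boost for the exact clause; the value is arbitrary.
    static final float EXACT_BOOST = 2.0f;

    // American Soundex key (first letter plus three digits), used here
    // as a simple stand-in for a Metaphone encoder.
    static String soundex(String word) {
        String letters = word.toUpperCase(Locale.ROOT).replaceAll("[^A-Z]", "");
        if (letters.isEmpty()) {
            return "";
        }
        StringBuilder out = new StringBuilder().append(letters.charAt(0));
        char prev = code(letters.charAt(0));
        for (int i = 1; i < letters.length() && out.length() < 4; i++) {
            char c = letters.charAt(i);
            if (c == 'H' || c == 'W') {
                continue; // H and W do not separate letters with the same code
            }
            char d = code(c);
            if (d != '0' && d != prev) {
                out.append(d);
            }
            prev = d; // a vowel resets prev, so a repeated code after it is kept
        }
        while (out.length() < 4) {
            out.append('0'); // pad short keys with zeros
        }
        return out.toString();
    }

    // Soundex digit for a letter; '0' means the letter is not coded.
    static char code(char c) {
        switch (c) {
            case 'B': case 'F': case 'P': case 'V':
                return '1';
            case 'C': case 'G': case 'J': case 'K':
            case 'Q': case 'S': case 'X': case 'Z':
                return '2';
            case 'D': case 'T':
                return '3';
            case 'L':
                return '4';
            case 'M': case 'N':
                return '5';
            case 'R':
                return '6';
            default:
                return '0';
        }
    }

    // Toy score: both clauses are optional; each matching clause adds its weight.
    static float score(String indexedName, String queryTerm) {
        float score = 0f;
        if (indexedName.equalsIgnoreCase(queryTerm)) {
            score += EXACT_BOOST; // exact clause, boosted
        }
        String key = soundex(indexedName);
        if (!key.isEmpty() && key.equals(soundex(queryTerm))) {
            score += 1.0f; // phonetic clause
        }
        return score;
    }
}
```

With these assumptions, a name that matches the query exactly scores higher than one that only sounds alike, and a name that matches neither clause scores zero, which is the ranking behaviour the boosted exact clause is meant to give.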