Subject: Re: Implement Custom Soundex
From: Paul Libbrecht
Date: Sun, 23 Oct 2011 10:58:49 +0200
To: solr-user@lucene.apache.org

Momo,

if you have the text-to-token conversion, then all you need to do is implement a custom analyzer, deploy it inside the Solr webapp, and plug it into the schema.

Is that the part that is hard? I thought the wiki was helpful there, but maybe some other issue is holding you up. One zoology of such analyzers is at:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

If that is the issue, here's a one-sentence explanation: if you have a new analyzer, you want to declare a new field type and a field that uses that analyzer; queries should go through it as well as indexing. Matching word A with word B will then happen if A and B are converted by your analyzer to the same token (this is how "cat" and "cats" match when using the PorterStemmer, for example).

paul

On 16 Oct 2011 at 14:09, Momo..Lelo .. wrote:

> Dear Gora,
>
> Thank you for the quick response.
>
> Actually, I need to do Soundex for the Arabic language. The code is already done in Java, but I couldn't understand how I can implement it as a Solr filter.
>
> Regards,
>
>> From: gora@mimirtech.com
>> Date: Sun, 16 Oct 2011 16:19:48 +0530
>> Subject: Re: Implement Custom Soundex
>> To: solr-user@lucene.apache.org
>>
>> 2011/10/16 Momo..Lelo .. :
>>>
>>> Dear,
>>>
>>> Does anyone here have experience developing a custom Soundex?
>>>
>>> If you have experience doing this and can offer some help and share your experience, I'd really appreciate it.
>>
>> I presume that this is in the context of Solr, and spell-checking.
>> We did this as an exercise for Indian-language words transliterated
>> into English, hooking into the open-source spell-checking library,
>> aspell, which provided us with a soundex-like algorithm (the actual
>> algorithm is quite different, but works better than soundex, at
>> least for our use case). We were quite satisfied with the results,
>> though unfortunately this never went into production.
>>
>> Would be glad to help, though I am going to be really busy the
>> next few days. Please do provide us with more details on your
>> requirements.
>>
>> Regards,
>> Gora
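
The matching rule from the reply at the top of this message (word A matches word B when the analyzer turns both into the same token, applied at index time and at query time) can be sketched without Solr at all. The block below is only an illustration under stated assumptions: `PhoneticMatchSketch`, `encode`, `analyze`, `add` and `search` are hypothetical names, and `encode` is a toy vowel-dropping placeholder, not the Arabic Soundex discussed in the thread. In Solr the same encoding step would presumably live in a Lucene `TokenFilter` referenced from a field type in the schema, which this stdlib-only sketch does not attempt to show.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class PhoneticMatchSketch {

    /** Toy placeholder for a custom Soundex: keep the first letter, drop later vowels. */
    static String encode(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < w.length(); i++) {
            char c = w.charAt(i);
            boolean vowel = "aeiou".indexOf(c) >= 0;
            if (i == 0 || !vowel) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** The "analyzer": split on whitespace, then encode every token. */
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                tokens.add(encode(raw));
            }
        }
        return tokens;
    }

    /** Tiny inverted index: encoded token -> ids of documents containing it. */
    private final Map<String, Set<Integer>> index = new HashMap<>();

    /** Index time: the text goes through the analyzer before being stored. */
    void add(int docId, String text) {
        for (String token : analyze(text)) {
            index.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    /** Query time: the query goes through the same analyzer before lookup. */
    Set<Integer> search(String query) {
        Set<Integer> hits = new TreeSet<>();
        for (String token : analyze(query)) {
            hits.addAll(index.getOrDefault(token, Collections.emptySet()));
        }
        return hits;
    }

    public static void main(String[] args) {
        PhoneticMatchSketch idx = new PhoneticMatchSketch();
        idx.add(1, "Mohamed Salah");
        idx.add(2, "Paul Smith");
        // "Mohamad" and "Mohamed" both encode to "mhmd", so document 1 matches.
        System.out.println(idx.search("Mohamad")); // prints [1]
    }
}
```

Because `add` and `search` both call `analyze`, a spelling variant in the query still reaches the indexed document; if only one side were analyzed, the tokens would never line up.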