lucene-solr-user mailing list archives

From Paul Libbrecht <p...@hoplahup.net>
Subject Re: Implement Custom Soundex
Date Sun, 23 Oct 2011 08:58:49 GMT
Momo,

If you already have the code that converts text to tokens, then all you need to do is implement
a custom analyzer, deploy it inside the Solr webapp, and plug it into the schema.

Is that the part that is hard?
I thought the wiki was helpful there, but maybe some other issue is holding you up.
A catalogue of such analyzers is at:
	http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

If that is the issue, here's a short explanation: if you have a new analyzer, you declare
a new field type and a field that uses that analyzer; queries should go through it as well
as indexing. Word A will then match word B if your analyzer converts both to the same token
(this is how "cat" and "cats" match when using the PorterStemmer, for example).
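
To make that matching rule concrete, here is a minimal sketch in plain Java. It is not Solr
code and not an Arabic Soundex: the class and method names are made up for the illustration,
and classic English Soundex stands in for whatever conversion your own code performs. In Solr
the same conversion would run inside the analyzer at both index and query time.

```java
import java.util.Locale;

// Hypothetical illustration class; not part of Solr or Lucene.
class PhoneticMatchSketch {

    // Classic English Soundex digit for each letter a..z; '0' marks letters that are dropped.
    private static final String CODES = "01230120022455012623010202";

    // Converts a word to its token: first letter plus up to three digits, zero-padded.
    static String encode(String word) {
        String w = word.toUpperCase(Locale.ROOT);
        StringBuilder out = new StringBuilder();
        char last = '0';
        for (int i = 0; i < w.length() && out.length() < 4; i++) {
            char c = w.charAt(i);
            if (c < 'A' || c > 'Z') {
                continue; // ignore anything outside A-Z in this toy encoder
            }
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c);      // the first letter is kept as-is
                last = code;
            } else if (c != 'H' && c != 'W') {
                // H and W are skipped entirely; vowels reset 'last' so that
                // equal codes separated by a vowel are both kept.
                if (code != '0' && code != last) {
                    out.append(code);
                }
                last = code;
            }
        }
        while (out.length() > 0 && out.length() < 4) {
            out.append('0');
        }
        return out.toString();
    }

    // Two words "match" when the conversion gives them the same token.
    static boolean matches(String a, String b) {
        return encode(a).equals(encode(b));
    }

    public static void main(String[] args) {
        System.out.println(encode("Robert") + " " + encode("Rupert"));
        System.out.println(matches("Robert", "Rupert"));
    }
}
```

Here "Robert" and "Rupert" both convert to R163, so a query for one would find the other,
provided the field is indexed and queried through the same conversion.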

paul


On 16 Oct 2011, at 14:09, Momo..Lelo .. wrote:

> 
> Dear Gora, 
> 
> Thank you for the quick response. 
> 
> Actually, I need to do Soundex for the Arabic language. The code is already written in Java,
> but I couldn't understand how to implement it as a Solr filter.
> 
> Regards,
> 
> 
> 
>> From: gora@mimirtech.com
>> Date: Sun, 16 Oct 2011 16:19:48 +0530
>> Subject: Re: Implement Custom Soundex
>> To: solr-user@lucene.apache.org
>> 
>> 2011/10/16 Momo..Lelo .. <galag999@hotmail.com>:
>>> 
>>> Dear,
>>> 
>>> Does anyone here have experience developing a custom Soundex?
>>>
>>> If you have experience doing this and can offer some help and share your experience,
>>> I'd really appreciate it.
>> 
>> I presume that this is in the context of Solr, and spell-checking.
>> We did this as an exercise for Indian-language words transliterated
>> into English, hooking into the open-source spell-checking library,
>> aspell, which provided us with a soundex-like algorithm (the actual
>> algorithm is quite different, but works better than soundex, at
>> least for our use case). We were quite satisfied with the results,
>> though unfortunately this never went into production.
>> 
>> Would be glad to help, though I am going to be really busy the
>> next few days. Please do provide us with more details on your
>> requirements.
>> 
>> Regards,
>> Gora