commons-dev mailing list archives

From "Gary Gregory" <ggreg...@seagullsoftware.com>
Subject RE: [codec]Implementing support for additional non-english vowels in double metaphone
Date Sat, 28 Oct 2006 19:01:04 GMT
Hello Steinar:

The current DoubleMetaphone implementation (released and SVN) allows for
Spanish and Germanic characters, so adding support for other languages
in the same class seems to be in the spirit of the current
implementation. 

I would also say that having language-specific implementations sounds
like a reasonable idea. I wonder whether the current implementation
pays a performance cost for attempting to work for all languages. That
seems like a bigger topic, though, and might be worth discussing
separately if the list is interested.
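
A minimal sketch of the factory idea raised in the quoted message, under
stated assumptions: VowelSetFactory, VowelSet and forLanguage are
hypothetical names and are not part of commons-codec, and the base vowel
set used here is an assumption rather than something taken from the
DoubleMetaphone source. It only shows how a per-language vowel test could
be selected by language code; it does not implement double metaphone.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: none of these types exist in commons-codec. */
class VowelSetFactory {

    /** The vowels a language-specific encoder might recognise. */
    public static final class VowelSet {
        private final String vowels;

        VowelSet(String vowels) {
            this.vowels = vowels;
        }

        /** Case-insensitive vowel test. */
        public boolean isVowel(char c) {
            return vowels.indexOf(Character.toUpperCase(c)) >= 0;
        }
    }

    // Assumed base set for illustration only.
    private static final String BASE = "AEIOUY";

    private static final Map<String, VowelSet> BY_LANGUAGE = new HashMap<>();
    private static final VowelSet DEFAULT = new VowelSet(BASE);

    static {
        // Norwegian and Danish add the vowels AE-ligature, O-slash, A-ring.
        VowelSet norwegianDanish = new VowelSet(BASE + "\u00C6\u00D8\u00C5");
        BY_LANGUAGE.put("no", norwegianDanish);
        BY_LANGUAGE.put("da", norwegianDanish);
        // Swedish adds A-ring, A-umlaut, O-umlaut.
        BY_LANGUAGE.put("sv", new VowelSet(BASE + "\u00C5\u00C4\u00D6"));
    }

    /** Returns the vowel set for a language code, or the default set. */
    public static VowelSet forLanguage(String languageCode) {
        VowelSet set = BY_LANGUAGE.get(languageCode.toLowerCase(Locale.ROOT));
        return set != null ? set : DEFAULT;
    }
}
```

The alternative discussed above, extending the single class so that it
accepts these characters for every input, avoids the lookup but applies
the same rules to all languages.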

So I would say: create a JIRA ticket [1], then go ahead and submit
patches [2] for the code *and* unit tests, based on the SVN code [3].

Thank you,
Gary

[1] https://issues.apache.org/jira/browse/CODEC
[2] http://jakarta.apache.org/commons/patches.html
[3] http://jakarta.apache.org/commons/codec/cvs-usage.html

> -----Original Message-----
> From: Steinar Cook [mailto:steinar@balder.no]
> Sent: Monday, October 23, 2006 1:35 PM
> To: commons-dev@jakarta.apache.org
> Subject: [codec]Implementing support for additional non-english vowels in double metaphone
> 
> I have made some modifications to
> org.apache.commons.codec.language.DoubleMetaphone in order to support
> the three additional Norwegian and Danish vowels.  The current
> implementation at Jakarta does not provide any methods to specify the
> language of the input text.
> 
> Is it all right to modify DoubleMetaphone to support the Scandinavian
> vowels (Swedish, Danish and Norwegian) and possibly other languages
> or have I completely misunderstood the idea behind the double
> metaphone algorithm? That is, should double metaphone detect various
> language constructs automatically or is it perhaps a better idea to
> have a factory which returns a double metaphone implementation
> appropriate for the language?
> 
> Any suggestions?
> 
> I would like to contribute any changes back to Jakarta commons-codec,
> of course.
> 
> 
> Steinar Cook
> steinar@balder.no
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-dev-help@jakarta.apache.org
> 

