commons-dev mailing list archives

From Gary Gregory <garydgreg...@gmail.com>
Subject Re: [CODEC] Beider Morse Phonetic Matching (BMPM) and Daitch-Mokotoff Soundex (DM)
Date Fri, 13 Jun 2014 20:33:06 GMT
I can help shepherd Java patches into the code base if and when appropriate.

Gary


On Fri, Jun 13, 2014 at 12:05 PM, Michael Tobias <michael@tobias.org.uk>
wrote:

> I recently joined this list as I have started to examine Apache Solr and
> am extremely interested in using soundex and phonetic tokens.
>
> I have already pointed out some bugs in the current implementation of BMPM
> in Commons Codec, and one has already been fixed.
>
> Having checked archived messages relating to the introduction of BMPM, I
> see that at the time it was also discussed whether to implement
> Daitch-Mokotoff Soundex at the same time.  It looks like this was never
> taken up, but I am really interested in having this functionality.
>
> Daitch-Mokotoff is a much simpler algorithm than BMPM (though it can
> 'branch' and produce multiple tokens for the same word). It uses a rules
> table along with a small number of additional instructions. The algorithm
> is in the public domain and there are various implementations available
> (including a few apparently written in Java, but I am not convinced they
> are correct). If it is felt necessary, I can get written permission from
> Gary Mokotoff and Randy Daitch to allow the algorithm to be used.
>
> I am currently discussing some changes to the algorithm with Gary Mokotoff
> and hope to have them agreed shortly.
>
> At that point I will probably have a simple PHP implementation (not my
> code, but permission to adapt will be granted) which I would be interested
> in having ported to Java for inclusion in Commons Codec.
>
> Is anybody on this list interested in assisting with this and porting an
> agreed PHP implementation to Java?  I will be happy to test all output
> until we are satisfied it is fully functional.
>
> Thanks
>
> Michael
>
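
For readers who have not seen the 'branching' behaviour mentioned in the quoted message, here is a rough Java sketch of a rules-table encoder that can produce several tokens for one word. It is only an illustration under stated assumptions: the class name, the method name and the tiny rules table are all hypothetical, and this is not the Daitch-Mokotoff table or the PHP implementation discussed above. A real implementation would, as far as I understand it, also need position-dependent codes, collapsing of adjacent duplicate codes and padding to a fixed length, none of which is attempted here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class BranchingEncoderSketch {

    // Hypothetical toy rules (NOT the real Daitch-Mokotoff table):
    // each letter pattern maps to one or more alternative codes.
    private static final Map<String, List<String>> RULES = new LinkedHashMap<>();
    static {
        RULES.put("CH", Arrays.asList("5", "4")); // ambiguous pattern: causes a branch
        RULES.put("K", Arrays.asList("5"));
        RULES.put("M", Arrays.asList("6"));
        RULES.put("N", Arrays.asList("6"));
        RULES.put("T", Arrays.asList("3"));
        RULES.put("D", Arrays.asList("3"));
    }

    // Length of the longest pattern in the toy table.
    private static final int MAX_PATTERN = 2;

    // Returns every token the toy rules can produce for the given word.
    static Set<String> encode(String word) {
        String s = word.toUpperCase(Locale.ROOT);
        Set<String> branches = new LinkedHashSet<>();
        branches.add("");
        int i = 0;
        while (i < s.length()) {
            List<String> codes = null;
            int matched = 1; // letters with no rule are skipped one at a time
            // Prefer the longest pattern that matches at position i.
            for (int len = Math.min(MAX_PATTERN, s.length() - i); len >= 1; len--) {
                codes = RULES.get(s.substring(i, i + len));
                if (codes != null) {
                    matched = len;
                    break;
                }
            }
            if (codes != null) {
                // Extend every existing branch with every alternative code.
                Set<String> next = new LinkedHashSet<>();
                for (String prefix : branches) {
                    for (String code : codes) {
                        next.add(prefix + code);
                    }
                }
                branches = next;
            }
            i += matched;
        }
        return branches;
    }
}
```

With this toy table, BranchingEncoderSketch.encode("Chan") yields two tokens because "CH" has two alternative codes, while encode("Tom") yields one. Keeping the branches in a set means duplicate tokens that arise from different paths are merged.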


-- 
E-Mail: garydgregory@gmail.com | ggregory@apache.org
Java Persistence with Hibernate, Second Edition
<http://www.manning.com/bauer3/>
JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
Spring Batch in Action <http://www.manning.com/templier/>
Blog: http://garygregory.wordpress.com
Home: http://garygregory.com/
Tweet! http://twitter.com/GaryGregory
