commons-issues mailing list archives

From "Sebb (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CODEC-199) Bug in HW rule in Soundex
Date Fri, 31 Mar 2017 23:55:41 GMT

    [ https://issues.apache.org/jira/browse/CODEC-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15951828#comment-15951828 ]

Sebb commented on CODEC-199:
----------------------------

I think it makes sense to split this issue into two parts:

1) Fixing the bug in the American Soundex algorithm implementation.
That is the scope of this issue, CODEC-199.

2) Enhancing the class to support other variants, i.e. Simplified Soundex and
the Genealogy variant (whatever its name is).
That will now be dealt with under CODEC-233.

> Bug in HW rule in Soundex
> -------------------------
>
>                 Key: CODEC-199
>                 URL: https://issues.apache.org/jira/browse/CODEC-199
>             Project: Commons Codec
>          Issue Type: Bug
>    Affects Versions: 1.10
>            Reporter: Yossi Tamari
>             Fix For: 1.11
>
>         Attachments: better.patch, soundex.patch
>
>
> The Soundex algorithm says that if two characters that map to the same code are separated
> by H or W, the second one is not encoded.
> However, in the implementation (in Soundex.getMappingCode() line 191), a character that
> is preceded by two characters that are each either H or W is not encoded, regardless of
> what the last consonant was.
> Source: http://en.wikipedia.org/wiki/Soundex#American_Soundex
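
For reference, the HW rule described in the report can be sketched as below. This is a minimal illustration of American Soundex as described on the cited Wikipedia page, not the Commons Codec implementation or either attached patch. The class name SoundexHwSketch and its methods are hypothetical; only the letter-to-digit table is the standard US English mapping.

```java
import java.util.Locale;

// Hypothetical sketch of American Soundex; not the Commons Codec class.
final class SoundexHwSketch {

    // Digit for each letter A..Z; '0' marks letters that produce no digit
    // (the vowels A, E, I, O, U, Y and also H and W).
    private static final String CODES = "01230120022455012623010202";

    static String encode(String name) {
        String s = name.toUpperCase(Locale.ROOT);
        StringBuilder out = new StringBuilder(4);
        char last = '0'; // code of the last letter that was not H or W

        for (int i = 0; i < s.length() && out.length() < 4; i++) {
            char c = s.charAt(i);
            if (c < 'A' || c > 'Z') {
                continue; // ignore anything that is not an ASCII letter
            }
            char digit = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c); // the first letter is kept as-is
                last = digit;  // but its code still takes part in the rules
                continue;
            }
            if (c == 'H' || c == 'W') {
                continue; // HW rule: H and W are transparent, 'last' is unchanged
            }
            if (digit != '0' && digit != last) {
                out.append(digit);
            }
            last = digit; // a vowel resets 'last', so a repeated code is emitted again
        }
        if (out.length() == 0) {
            return ""; // no letters in the input
        }
        while (out.length() < 4) {
            out.append('0'); // pad to one letter plus three digits
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("Ashcraft")); // A261: S and C share code 2 across H
        System.out.println(encode("Tymczak"));  // T522: the A between Z and K resets the code
        System.out.println(encode("Pfister"));  // P236: F repeats the code of the first letter
    }
}
```

The key point is that H and W never change the remembered code, so whether a following consonant is skipped depends only on the last consonant or vowel before them. With a made-up input such as "Tahwn", this sketch returns T500: the N is encoded because nothing with code 5 precedes it, whereas the behaviour described in the report would skip it simply because two H/W characters come before it.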



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
