commons-issues mailing list archives

From "Sebb (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CODEC-199) Bug in HW rule in Soundex
Date Thu, 30 Mar 2017 12:54:42 GMT

    [ https://issues.apache.org/jira/browse/CODEC-199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949007#comment-15949007 ]

Sebb commented on CODEC-199:
----------------------------

Also, calling code may supply its own mapping string and expect the library code to deal with H and W.

I think it would be better to revert to the original patch, which fixes the HW behaviour in
the code.

This makes the fix less generic, but AFAICT the Soundex algorithm is specific to English-language
names.

> Bug in HW rule in Soundex
> -------------------------
>
>                 Key: CODEC-199
>                 URL: https://issues.apache.org/jira/browse/CODEC-199
>             Project: Commons Codec
>          Issue Type: Bug
>    Affects Versions: 1.10
>            Reporter: Yossi Tamari
>             Fix For: 1.11
>
>         Attachments: better.patch, soundex.patch
>
>
> The Soundex algorithm says that if two characters that map to the same code are separated
> by H or W, the second one is not encoded.
> However, in the implementation (in Soundex.getMappingCode() line 191), a character that
> is preceded by two characters that are either H or W, is not encoded, regardless of what the
> last consonant was.
> Source: http://en.wikipedia.org/wiki/Soundex#American_Soundex
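
For illustration only, the H/W rule described above could be sketched as follows. This is
not the Commons Codec implementation and not either attached patch; the class name
SoundexHwSketch and the method encode are hypothetical, and the mapping string is assumed
to follow the usual American Soundex letter-to-digit table. H and W are skipped without
resetting the last code, while vowels do reset it.

```java
import java.util.Locale;

// Hypothetical sketch of American Soundex with the H/W rule; not the Commons Codec code.
class SoundexHwSketch {

    // Assumed mapping: digit for each letter A..Z; '0' means "no code" (vowels, H, W, Y).
    private static final String MAPPING = "01230120022455012623010202";

    static String encode(String name) {
        String s = name.toUpperCase(Locale.ENGLISH);
        StringBuilder out = new StringBuilder();
        char last = '0'; // code of the last letter that was not H or W
        for (int i = 0; i < s.length() && out.length() < 4; i++) {
            char c = s.charAt(i);
            if (c < 'A' || c > 'Z') {
                continue; // ignore anything that is not a basic Latin letter
            }
            char code = MAPPING.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c); // the first letter is kept as-is
                last = code;   // but its code still suppresses an adjacent duplicate
                continue;
            }
            if (c == 'H' || c == 'W') {
                continue; // transparent: 'last' is deliberately left unchanged
            }
            if (code != '0' && code != last) {
                out.append(code);
            }
            last = code; // a vowel sets 'last' to '0', so a repeated code is encoded again
        }
        if (out.length() == 0) {
            return ""; // no letters in the input
        }
        while (out.length() < 4) {
            out.append('0'); // pad to the usual four characters
        }
        return out.toString();
    }
}
```

With this sketch, "Ashcraft" encodes to A261 because the C after the H repeats the code of
the S, whereas a letter that follows H or W with a different code is still encoded.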



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)