commons-issues mailing list archives

From "Sebb (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (CODEC-233) Soundex should support more algorithm variants
Date Sun, 02 Apr 2017 20:45:42 GMT

     [ https://issues.apache.org/jira/browse/CODEC-233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebb resolved CODEC-233.
------------------------
       Resolution: Fixed
    Fix Version/s: 1.11

The special-case processing of H and W can be overridden through a new constructor and/or by
the presence of any SILENT code characters in the mapping. This avoids the need to add a
spurious mapping character.

URL: http://svn.apache.org/viewvc?rev=1789911&view=rev
Log:
CODEC-233 Soundex should support more algorithm variants

Modified:
    commons/proper/codec/trunk/src/changes/changes.xml
    commons/proper/codec/trunk/src/main/java/org/apache/commons/codec/language/Soundex.java
    commons/proper/codec/trunk/src/test/java/org/apache/commons/codec/language/SoundexTest.java


> Soundex should support more algorithm variants
> ----------------------------------------------
>
>                 Key: CODEC-233
>                 URL: https://issues.apache.org/jira/browse/CODEC-233
>             Project: Commons Codec
>          Issue Type: New Feature
>            Reporter: Sebb
>             Fix For: 1.11
>
>
> The existing Soundex class was designed around the American Soundex algorithm.
> Whilst it offers some flexibility in the mapping of letters to Soundex numbers, the
> list of 'silent' letters (H and W) is built into the code. There is no provision for
> changing the set of silent (ignored) letters.
> There is also no way to change the designation of H and W from silent to consonant
> separator (i.e. code '0'), because H and W are already encoded as '0' in the public API.
> To fix this, the mapping can be enhanced to support an extra code for 'silent' letters.
> A mapping that includes such a code had no defined behaviour previously, so it can be
> treated differently: there is no need to assume H and W are silent.
> This allows alternative silent letters to be defined.
> It can also be used to map H and W as code '0', as long as the mapping contains at
> least one 'silent' code.
> If the algorithm variant has no actual silent letters, the silent code can be appended
> to the end of the mapping. This does not affect processing, as only the letters A-Z are
> passed to the method.
> An alternative would be to introduce yet another code as an alias for '0', and only
> treat H and W as silent if they have code '0'.
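
The behaviour described above can be illustrated with a small standalone sketch. This is not
the Commons Codec implementation: the class name SoundexSketch, the encode() signature, and
the use of '-' as the silent code are assumptions made for illustration only. The 26-character
mapping string is the usual American Soundex letter-to-digit table.

```java
// Hypothetical standalone sketch; not the code committed for CODEC-233.
final class SoundexSketch {

    // Assumed marker for 'silent' letters in a mapping string.
    static final char SILENT = '-';

    // American Soundex codes for A..Z; vowels, H, W and Y all map to '0'.
    static final String US_ENGLISH = "01230120022455012623010202";

    // Vowels, H, W and Y all marked silent instead of acting as separators.
    static final String ALL_SILENT = "-123-12--22455-12623-1-2-2";

    // Encodes a name to a 4-character code.
    // specialCaseHW requests the built-in "H and W are silent" rule; in this
    // sketch the rule is also dropped whenever the mapping holds a silent code.
    static String encode(String name, String mapping, boolean specialCaseHW) {
        if (mapping.length() < 26) {
            throw new IllegalArgumentException("mapping must cover A-Z");
        }
        boolean hwSilent = specialCaseHW && mapping.indexOf(SILENT) < 0;

        // Keep only the letters A-Z, upper-cased.
        StringBuilder letters = new StringBuilder();
        for (char c : name.toCharArray()) {
            char u = Character.toUpperCase(c);
            if (u >= 'A' && u <= 'Z') {
                letters.append(u);
            }
        }
        if (letters.length() == 0) {
            return "";
        }

        StringBuilder out = new StringBuilder();
        char first = letters.charAt(0);
        out.append(first);
        char last = mapping.charAt(first - 'A');

        for (int i = 1; i < letters.length() && out.length() < 4; i++) {
            char ch = letters.charAt(i);
            if (hwSilent && (ch == 'H' || ch == 'W')) {
                continue; // silent: does not separate equal codes
            }
            char code = mapping.charAt(ch - 'A');
            if (code == SILENT) {
                continue; // silent: same treatment, driven by the mapping
            }
            if (code != '0' && code != last) {
                out.append(code);
            }
            last = code; // code '0' resets 'last', so it acts as a separator
        }
        while (out.length() < 4) {
            out.append('0'); // pad short codes with zeros
        }
        return out.toString();
    }
}
```

With these assumptions, encode("Ashcraft", US_ENGLISH, true) yields A261, because the silent H
lets S and C collapse into one code. Passing false, or appending the silent code to the mapping
(US_ENGLISH + "-"), makes H a separator and yields A226. The appended code sits at index 26, so
no letter A-Z ever maps to it.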



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
