commons-dev mailing list archives

From Gary Gregory <GGreg...@seagullsoftware.com>
Subject [codec] Testing Cologne Phonetic
Date Sun, 20 Feb 2011 20:02:04 GMT
Hi Franz,

We are adding an implementation of Cologne Phonetic to Apache Commons Codec for version 1.5.

I want to use your data from http://sourceforge.net/projects/familynamephon/ as a test fixture.

I've found differences between the codes generated by our ColognePhonetic class and the data
in the data file.

I'd like to know if this is a bug in our code, the code that generates the test fixture, or
a "feature" of some kind that I am unaware of.

Since I am not a German speaker, some help would be appreciated.

For example, the first two lines in the data file are:

1    Aa        a       a             A      A000
2    Aaaken    aken    aken    46    AKN    A250

(Team Codec: The Cologne Phonetic code is in column 4.)

Aa has no code at all in the data file; we generate 0.
Aaaken maps to 46 in the data file; we generate 046.
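
(Team Codec: both mismatches above look like a leading 0 on our side. As a rough sketch for checking whether the other differences follow the same pattern, something like the following could be run over pairs of fixture code and generated code. The class and method names are hypothetical, it does not call ColognePhonetic itself, and it assumes the two codes have already been obtained as strings.)

```java
public final class LeadingZeroCheck {

    // Drops every '0' at the start of a code, e.g. "046" -> "46", "0" -> "".
    static String stripLeadingZeros(String code) {
        int i = 0;
        while (i < code.length() && code.charAt(i) == '0') {
            i++;
        }
        return code.substring(i);
    }

    // True when the two codes are equal once leading zeros are ignored.
    static boolean differsOnlyByLeadingZeros(String fixtureCode, String generatedCode) {
        return stripLeadingZeros(fixtureCode).equals(stripLeadingZeros(generatedCode));
    }

    public static void main(String[] args) {
        // The two pairs quoted in this message: fixture value first, our value second.
        System.out.println(differsOnlyByLeadingZeros("", "0"));
        System.out.println(differsOnlyByLeadingZeros("46", "046"));
    }
}
```

If every mismatching pair passes this check, the difference is probably a single convention about leading zeros rather than a set of unrelated bugs.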

Bug, not a bug?

Thoughts?

Thank you,
Gary Gregory
Senior Software Engineer
Rocket Software
3340 Peachtree Road, Suite 820 * Atlanta, GA 30326 * USA
Tel: +1.404.760.1560
Email: ggregory@seagullsoftware.com
Web: http://www.seagull.rocketsoftware.com/


