commons-dev mailing list archives

From "C. Scott Ananian" <csc...@cscott.net>
Subject RE: [codec] Soundex issue with accented character.
Date Wed, 02 Jun 2004 15:18:52 GMT
On Wed, 2 Jun 2004, Edelson, Justin wrote:

> That's not the behavior either in the latest [codec] release or HEAD.
> Can you clarify where this 'standard' behavior you describe is
> documented? Neither the National Archives documentation nor the NIST
> source code contain this behavior.

Um, Google's "lucky" search for soundex returns
  http://www.archives.gov/research_room/genealogy/census/soundex.html
which contains the text:
 "Disregard the letters A, E, I, O, U, H, W, and Y".

Disregarding é (for example) seems a completely reasonable thing to
do.
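
To make the suggestion concrete, here is a minimal sketch of a Soundex
encoder that disregards any character outside A-Z the same way the quoted
rule disregards vowels. It is not the [codec] implementation; the class
name LenientSoundex and the method encode are hypothetical, and skipping a
leading non-ASCII character when picking the first letter is an assumption
of this sketch, not something the National Archives text specifies.

```java
import java.util.Locale;

// Hypothetical illustration only; not the Commons Codec Soundex class.
class LenientSoundex {
    // Soundex digit for each letter A..Z; '0' marks letters to disregard
    // (A, E, I, O, U, H, W, Y).
    private static final String CODES = "01230120022455012623010202";

    static String encode(String name) {
        String s = name.toUpperCase(Locale.ROOT);
        StringBuilder out = new StringBuilder();
        char last = '0';
        for (int i = 0; i < s.length() && out.length() < 4; i++) {
            char c = s.charAt(i);
            boolean ascii = c >= 'A' && c <= 'Z';
            // Assumption: anything outside A-Z (such as an accented
            // letter) is disregarded, exactly like a vowel.
            char code = ascii ? CODES.charAt(c - 'A') : '0';
            if (out.length() == 0) {
                // Assumption: the first ASCII letter is the retained letter.
                if (ascii) {
                    out.append(c);
                    last = code;
                }
                continue;
            }
            if (c == 'H' || c == 'W') {
                continue; // H and W do not separate equal codes
            }
            if (code != '0' && code != last) {
                out.append(code);
            }
            last = code; // a disregarded letter resets the run
        }
        if (out.length() == 0) {
            return ""; // no ASCII letter found
        }
        while (out.length() < 4) {
            out.append('0'); // pad to letter plus three digits
        }
        return out.toString();
    }
}
```

Under these assumptions, encode("Renée") and encode("Renee") produce the
same code, which is the behavior argued for above.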
 --scott

                         ( http://cscott.net/ )
