commons-dev mailing list archives

From "Edelson, Justin" <Justin.Edel...@mtvi.com>
Subject RE: [codec] Soundex issue with accented character.
Date Wed, 02 Jun 2004 15:25:41 GMT
Agreed, but that only addresses vowels, not (for example) N with tilde
or C with cedilla.
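
To make the distinction concrete, here is a rough sketch (Python for
brevity; this is not what [codec] does in the release or HEAD, and the
names fold_accents and soundex are made up for illustration). It folds
accented letters to their base letter before encoding, so N with tilde
is coded as N and C with cedilla as C instead of being disregarded.
The Soundex here is a simplified version of the National Archives
rules and assumes the input decomposes cleanly under Unicode NFD.

```python
import unicodedata

# Soundex digit groups as described in the National Archives rules.
_CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        _CODES[ch] = digit


def fold_accents(text):
    """Decompose to NFD and drop combining marks, e.g. n-tilde -> n."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def soundex(name):
    """Simplified Soundex over accent-folded input (illustrative only)."""
    # Assumption: anything that is not A-Z after folding is ignored.
    letters = [ch for ch in fold_accents(name).upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    previous = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != previous:
            result += code
        # H and W do not separate letters with the same code; vowels do.
        if ch not in "HW":
            previous = code
    return (result + "000")[:4]
```

With folding, "Muñoz" encodes the same as "Munoz" (M520). Disregarding
the ñ altogether would give M200 instead, which is why treating
accented consonants like vowels is not enough.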

> -----Original Message-----
> From: C. Scott Ananian [mailto:cscott@cscott.net] 
> Sent: Wednesday, June 02, 2004 11:19 AM
> To: Jakarta Commons Developers List
> Subject: RE: [codec] Soundex issue with accented character.
> 
> 
> On Wed, 2 Jun 2004, Edelson, Justin wrote:
> 
> > That's not the behavior either in the latest [codec] release or HEAD.
> > Can you clarify where this 'standard' behavior you describe is
> > documented? Neither the National Archives documentation nor the NIST
> > source code contains this behavior.
> 
> Um, Google's "lucky" search for soundex returns
>   http://www.archives.gov/research_room/genealogy/census/soundex.html
> which contains the text:
>  "Disregard the letters A, E, I, O, U, H, W, and Y".
> 
> Disregarding é (for example) seems a completely
> reasonable thing to do.  --scott
> 
> tonight NSA Moscow Castro non-violent protest Philadelphia 
> Nader Justice C4 SLBM Sudan security Albanian plutonium SSBN 
> 743 blowfish Japan
>                          ( http://cscott.net/ )
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-dev-help@jakarta.apache.org
> 
> 


