commons-issues mailing list archives

From "Paul Benedict (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LANG-285) Wish : method unaccent
Date Thu, 28 Feb 2008 03:12:51 GMT

    [ https://issues.apache.org/jira/browse/LANG-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573138#action_12573138 ]

Paul Benedict commented on LANG-285:
------------------------------------

+1 for the Collator approach. Accented characters are usually for Roman-derived languages
anyway. Perhaps this method can be generalized to perform any type of transformation with
a given Collator, so it doesn't depend on Western semantics.
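For readers unfamiliar with the idea, here is a minimal sketch of what a Collator offers in this context. It is not the attached patch, whose contents are not shown in this message. The class name CollatorAccentDemo and the helper sameBaseLetters are hypothetical names chosen for illustration. The sketch only shows that, at PRIMARY strength, a java.text.Collator treats strings that differ by accents (or case) as equal, and that the caller can supply any Collator rather than a Western one.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical demo class, not part of Commons Lang or the attached patch.
class CollatorAccentDemo {

    // Returns true when the supplied Collator considers the two strings
    // equal at PRIMARY strength, i.e. ignoring accent and case differences.
    static boolean sameBaseLetters(Collator collator, String a, String b) {
        // Work on a copy so the caller's Collator settings are not changed.
        Collator primary = (Collator) collator.clone();
        primary.setStrength(Collator.PRIMARY);
        // Decompose precomposed characters such as U+00E9 before comparing.
        primary.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        return primary.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        Collator french = Collator.getInstance(Locale.FRENCH);
        // "\u00e9t\u00e9" is "été" written with Unicode escapes.
        System.out.println(sameBaseLetters(french, "\u00e9t\u00e9", "ete"));
        System.out.println(sameBaseLetters(french, "\u00e9t\u00e9", "eta"));
    }
}
```

Note that a Collator only compares strings; turning that comparison into a transformation (mapping each character to a base letter) is the extra step a method like unaccent would have to supply.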

> Wish : method unaccent
> ----------------------
>
>                 Key: LANG-285
>                 URL: https://issues.apache.org/jira/browse/LANG-285
>             Project: Commons Lang
>          Issue Type: New Feature
>            Reporter: Guillaume Coté
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: LANG-285-unaccent-using-Collator.patch, MapBuilder.java, unaccent.patch, UnnacentMap.java
>
>
> I would like to add a method that replaces accented characters with their unaccented
> equivalents. For example, given the input String "L'été où j'ai dû aller à l'île
> d'Anticosti commenca tôt", the method would return "L'ete ou j'ai du aller a l'ile
> d'Anticosti commenca tot".
> I suggest calling the method unaccent and adding it to StringUtils.
> If we cannot cover all cases, the first version could cover only ISO-8859-1.
> If you are willing to go forward with this idea, I am willing to contribute a patch.
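
To make the requested behaviour concrete, here is one possible sketch of an unaccent method. It does not use the Collator approach discussed in this thread and is not any of the attached patches; it swaps in a different technique, Unicode decomposition via java.text.Normalizer (available since Java 6), followed by removal of combining marks. The class name UnaccentSketch is hypothetical, and placing the method in StringUtils is only the reporter's suggestion.

```java
import java.text.Normalizer;

// Hypothetical class for illustration; not an existing Commons Lang API.
class UnaccentSketch {

    // Strips diacritics by decomposing each character (NFD) and then
    // deleting the combining marks (Unicode category M) left behind.
    static String unaccent(String input) {
        if (input == null) {
            return null;
        }
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // The reporter's example sentence, written with Unicode escapes.
        String input = "L'\u00e9t\u00e9 o\u00f9 j'ai d\u00fb aller \u00e0 "
                + "l'\u00eele d'Anticosti commenca t\u00f4t";
        System.out.println(unaccent(input));
    }
}
```

With the reporter's sentence as input, this sketch prints "L'ete ou j'ai du aller a l'ile d'Anticosti commenca tot". Letters that have no canonical decomposition, such as ø (U+00F8), pass through unchanged, so a complete solution would still need a mapping table or a similar fallback for them.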

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

