commons-dev mailing list archives

From Benedikt Ritter
Subject Re: [TEXT] How do we want to handle case conversions?
Date Sun, 21 May 2017 14:47:40 GMT

> On 21.05.2017 at 08:06, Duncan Jones wrote:
> Hi everyone,
> I’ve found some time to continue breaking WordUtils into separate classes
> (eschewing the “big collection of static methods” approach). However, as I
> read more about case handling in Unicode, I realise how simplistic the
> WordUtils methods are and how complex a full solution would need to be.
> Section 5.18 of the Unicode specification [1] describes these complexities.
> The main ones that bother me are:
> 1. Title case conversions vary widely between different locales and
> languages. I’m not clear whether any locale is satisfied by the current
> simplistic implementation in WordUtils.capitalize(str). Supporting this
> correctly would be a serious challenge.
> 2. All types of case conversion may vary depending upon context/locale.
> There are examples provided in [1] where the outcome is different in a
> Turkish locale, or depends on whether the letter in question is followed
> by another letter.
> Does anyone have a suggestion for how to move forward with this work? I see
> three options:
> 1] Admit defeat and avoid the case conversion mess entirely.
> 2] Mimic the existing functionality, but document the limitations.
> 3] Attempt to deliver a locale-dependent version, perhaps still with
> limitations (or for certain languages).
> I’m leaning towards 2, perhaps even calling the classes “SimpleX…”.

Sounds good to me.
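
If we go with option 2, a small example in the docs might help make the limitations concrete. Below is a rough sketch of the kinds of behaviour described in points 1 and 2 of the quoted message. It uses only java.lang.String, java.lang.Character and java.util.Locale, not Commons Text. The class name CaseConversionPitfalls and the helper naiveCapitalize are hypothetical; naiveCapitalize only approximates a simplistic per-character approach and is not a copy of WordUtils.capitalize.

```java
import java.util.Locale;

public class CaseConversionPitfalls {

    // Hypothetical helper: upper-cases the first UTF-16 char in isolation,
    // with no locale and no one-to-many mappings.
    static String naiveCapitalize(String s) {
        if (s.isEmpty()) {
            return s;
        }
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");

        // Locale: in Turkish, 'i' upper-cases to dotted capital I (U+0130)
        // and 'I' lower-cases to dotless i (U+0131).
        System.out.println("i".toUpperCase(turkish).equals("\u0130"));
        System.out.println("I".toLowerCase(turkish).equals("\u0131"));

        // Context: capital sigma (U+03A3) at the end of a word lower-cases
        // to final sigma (U+03C2) rather than U+03C3.
        String odos = "\u039F\u0394\u039F\u03A3";
        System.out.println(odos.toLowerCase(Locale.ROOT).endsWith("\u03C2"));

        // One-to-many: sharp s (U+00DF) upper-cases to "SS" as a String,
        // but the per-char helper leaves it unchanged.
        System.out.println("\u00DF".toUpperCase(Locale.ROOT));
        System.out.println(naiveCapitalize("\u00DFa").equals("\u00DFa"));

        // Title case vs upper case: U+01C6 has a distinct title-case form.
        System.out.println(Character.toTitleCase('\u01C6') == '\u01C5');
        System.out.println(Character.toUpperCase('\u01C6') == '\u01C4');
    }
}
```

The point is only that a per-character, locale-free conversion cannot cover these cases, which is the sort of thing the “SimpleX…” Javadoc could state up front.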

> Thanks,
> Duncan
> [1]