hbase-issues mailing list archives

From "Sean Mackrory (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15889) String case conversions are locale-sensitive, used without locale
Date Wed, 25 May 2016 18:11:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300560#comment-15300560 ]

Sean Mackrory commented on HBASE-15889:

[~busbey] I'd love to - running through unit tests with it right now, actually...

I started out using Locale.ENGLISH, since it has been recommended in several places as
a locale with well-defined, correct behavior for the charsets commonly used in programming (e.g.
https://issues.apache.org/jira/browse/FILEUPLOAD-229). Locale.ROOT sounds good to me
too, though: the docs call it out as the neutral locale for this purpose.
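
For what it's worth, here is a minimal sketch of why either choice should behave the same for the ASCII identifiers in question. The class name, the helper, and the sample key are hypothetical, made up for illustration; they are not taken from the HBase code base.

```java
import java.util.Locale;

// Hypothetical demo class, not part of HBase.
class RootVsEnglish {
    // Lower-cases a string with an explicitly supplied locale,
    // so the result never depends on the JVM's default locale.
    static String lower(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    public static void main(String[] args) {
        String key = "TITLE_ID"; // made-up ASCII identifier
        String viaEnglish = lower(key, Locale.ENGLISH);
        String viaRoot = lower(key, Locale.ROOT);
        // Both locales apply the default Unicode case mapping to ASCII letters.
        System.out.println(viaEnglish + " / " + viaRoot); // prints "title_id / title_id"
    }
}
```

The point is only that the locale is passed explicitly; which of the two constants is used should not matter for plain ASCII input.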

> String case conversions are locale-sensitive, used without locale
> -----------------------------------------------------------------
>                 Key: HBASE-15889
>                 URL: https://issues.apache.org/jira/browse/HBASE-15889
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.2.0
>            Reporter: Sean Mackrory
>            Priority: Minor
> Static code analysis is flagging cases of String.toLowerCase and String.toUpperCase being
> used without Locale. From the API reference:
> {quote}
> Note: This method is locale sensitive, and may produce unexpected results if used for
> strings that are intended to be interpreted locale independently. Examples are programming
> language identifiers, protocol keys, and HTML tags. For instance, "TITLE".toLowerCase() in
> a Turkish locale returns "t\u0131tle", where '\u0131' is the LATIN SMALL LETTER DOTLESS I
> character. To obtain correct results for locale insensitive strings, use toLowerCase(Locale.ROOT).
> {quote}
> Many uses of these functions do appear to be looking up classes, etc. and not dealing
> with stored data, so I'd think there aren't significant compatibility problems here and specifying
> the locale is indeed the safer way to go.
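
The behavior quoted from the API reference can be reproduced with a short sketch like the one below. The class and method names are hypothetical, and the Turkish locale is passed explicitly rather than set as the JVM default, so the example does not change global state.

```java
import java.util.Locale;

// Hypothetical demo class, not part of HBase.
class TurkishCaseDemo {
    // Turkish locale, built from a standard BCP 47 language tag.
    static final Locale TURKISH = Locale.forLanguageTag("tr-TR");

    // Lower-cases with an explicit locale instead of the JVM default.
    static String lowerIn(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    public static void main(String[] args) {
        String turkish = lowerIn("TITLE", TURKISH);
        String neutral = lowerIn("TITLE", Locale.ROOT);

        // In Turkish, 'I' lower-cases to U+0131 (dotless i), so a lookup
        // keyed on the ASCII string "title" would miss.
        System.out.println(turkish.equals("t\u0131tle")); // prints "true"
        System.out.println(turkish.equals("title"));      // prints "false"

        // Locale.ROOT gives the locale-independent result.
        System.out.println(neutral.equals("title"));      // prints "true"
    }
}
```

A bare "TITLE".toLowerCase() would return the first result on a JVM whose default locale is Turkish, which is the case the static analysis is warning about.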
