commons-issues mailing list archives

From "Dominik Strecker (JIRA)" <>
Subject [jira] [Created] (LANG-1343) StringUtils#abbreviate breaks up surrogate pairs
Date Thu, 29 Jun 2017 09:45:00 GMT
Dominik Strecker created LANG-1343:

             Summary: StringUtils#abbreviate breaks up surrogate pairs
                 Key: LANG-1343
             Project: Commons Lang
          Issue Type: Bug
          Components: lang.*
    Affects Versions: 3.6
            Reporter: Dominik Strecker
            Priority: Minor

If the last char kept from the original string is the first char (high surrogate) of a
surrogate pair, the resulting string contains an unpaired surrogate: the high surrogate is
followed by the first char of the ellipsis instead of its low surrogate.

StringUtils.abbreviate("\uD83D\uDCA9\uD83D\uDCA9\uD83D\uDCA9", 4); // returns "\uD83D..."
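
A possible caller-side workaround, sketched with plain JDK calls only. The class name SurrogateSafeAbbreviate and its abbreviate method are hypothetical and are not part of Commons Lang; the sketch assumes the "..." ellipsis and the minimum width of 4 that StringUtils#abbreviate uses, and simply backs the cut point up by one char when it would land between the two halves of a surrogate pair.

```java
final class SurrogateSafeAbbreviate {

    private static final String ELLIPSIS = "...";

    // Hypothetical helper: abbreviates like the two-argument StringUtils#abbreviate,
    // but never cuts between a high surrogate and its low surrogate.
    static String abbreviate(String str, int maxWidth) {
        if (str == null) {
            return null;
        }
        if (maxWidth < ELLIPSIS.length() + 1) {
            throw new IllegalArgumentException("Minimum abbreviation width is 4");
        }
        if (str.length() <= maxWidth) {
            return str;
        }
        int end = maxWidth - ELLIPSIS.length();
        // The cut falls between index end - 1 and index end. If those two chars
        // form a surrogate pair, drop the whole pair instead of splitting it.
        if (Character.isHighSurrogate(str.charAt(end - 1))
                && Character.isLowSurrogate(str.charAt(end))) {
            end--;
        }
        return str.substring(0, end) + ELLIPSIS;
    }

    public static void main(String[] args) {
        String input = "\uD83D\uDCA9\uD83D\uDCA9\uD83D\uDCA9";
        // With width 4 the only kept char would be half of a pair, so only the ellipsis remains.
        System.out.println(abbreviate(input, 4).length());
        // With width 5 one complete pair fits in front of the ellipsis.
        System.out.println(abbreviate(input, 5).length());
    }
}
```

The trade-off in this sketch is that the result can be one char shorter than maxWidth allows, and for very small widths it can consist of the ellipsis alone.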

In my case this breaks further along when the string is transformed to UTF-8 for a SOAP request.
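
How exactly the SOAP stack fails is not shown here, but the underlying problem can be illustrated with the JDK alone: a strict UTF-8 encoder rejects a string containing an unpaired surrogate, while String#getBytes silently substitutes a replacement byte. The class name Utf8SurrogateCheck and the method isEncodableAsUtf8 below are hypothetical names used only for this illustration.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

final class Utf8SurrogateCheck {

    // Returns true if the string can be encoded as UTF-8 without loss.
    // A freshly created CharsetEncoder reports malformed input (such as an
    // unpaired surrogate) by throwing CharacterCodingException.
    static boolean isEncodableAsUtf8(String s) {
        try {
            StandardCharsets.UTF_8.newEncoder().encode(CharBuffer.wrap(s));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String abbreviated = "\uD83D...";      // value returned in the example above
        String intact = "\uD83D\uDCA9...";     // complete pair followed by the ellipsis

        System.out.println(isEncodableAsUtf8(abbreviated)); // false
        System.out.println(isEncodableAsUtf8(intact));      // true

        // String#getBytes does not throw; it replaces the malformed char with the
        // charset's replacement bytes, so the damage can go unnoticed at this point.
        byte[] lenient = abbreviated.getBytes(StandardCharsets.UTF_8);
        System.out.println(lenient.length);
    }
}
```

A check like this could be run on the abbreviated value before it is handed to the serializer, so that the failure shows up next to its cause.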

Should this at least be mentioned in the Javadoc?

This message was sent by Atlassian JIRA
