commons-issues mailing list archives

From "Kazuki Hamasaki (JIRA)" <j...@apache.org>
Subject [jira] [Created] (LANG-857) Bad surrogate pair handling in the CharSequenceTranslator
Date Tue, 20 Nov 2012 12:36:58 GMT
Kazuki Hamasaki created LANG-857:
------------------------------------

             Summary: Bad surrogate pair handling in the CharSequenceTranslator
                 Key: LANG-857
                 URL: https://issues.apache.org/jira/browse/LANG-857
             Project: Commons Lang
          Issue Type: Bug
          Components: lang.text.translate.*
    Affects Versions: 3.x
            Reporter: Kazuki Hamasaki
            Priority: Minor
         Attachments: CharSequenceTranslator_translate.patch

I found that CharSequenceTranslator handles surrogate pairs incorrectly.

The following simple test case reproduces the problem.
The input \uD83D\uDE30 is a single code point (U+1F630) encoded as a surrogate pair.

{code:java}
@Test
public void testEscapeSurrogatePairs() throws Exception {
    assertEquals("\uD83D\uDE30", StringEscapeUtils.escapeCsv("\uD83D\uDE30"));
}
{code}

Running the test throws the following exception:

{code}
java.lang.StringIndexOutOfBoundsException: String index out of range: 2
	at java.lang.String.charAt(String.java:658)
	at java.lang.Character.codePointAt(Character.java:4668)
	at org.apache.commons.lang3.text.translate.CharSequenceTranslator.translate(CharSequenceTranslator.java:95)
	at org.apache.commons.lang3.text.translate.CharSequenceTranslator.translate(CharSequenceTranslator.java:59)
	at org.apache.commons.lang3.StringEscapeUtils.escapeCsv(StringEscapeUtils.java:556)
{code}
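
The trace shows Character.codePointAt being asked for index 2 of a string that holds only two chars. The small demo below is a sketch of why that index is invalid; it does not reproduce the translator's loop, which this message does not show. The class name SurrogatePairDemo and its helper methods are made up for illustration.

```java
final class SurrogatePairDemo {
    // U+1F630 written as its UTF-16 surrogate pair (two chars, one code point).
    static final String FACE = "\uD83D\uDE30";

    // Number of UTF-16 code units (chars) in the sequence.
    static int charLength(CharSequence s) {
        return s.length();
    }

    // Number of Unicode code points in the sequence.
    static int codePoints(CharSequence s) {
        return Character.codePointCount(s, 0, s.length());
    }

    // True if reading a code point at the given char index fails with an index error.
    static boolean codePointAtThrows(CharSequence s, int index) {
        try {
            Character.codePointAt(s, index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("chars: " + charLength(FACE));
        System.out.println("code points: " + codePoints(FACE));
        System.out.println("index 0 ok: " + !codePointAtThrows(FACE, 0));
        System.out.println("index 2 throws: " + codePointAtThrows(FACE, 2));
    }
}
```

A loop that counts in code points but indexes in chars (or the reverse) can step past the end of the input on strings like this one, which would be consistent with the trace above.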

A patch is attached. The affected method is:
# public final void translate(CharSequence input, Writer out) throws IOException
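
For readers without the attachment, here is a rough sketch of one way a driver loop of this shape can stay within bounds: track a char position, and advance it by Character.charCount for each code point consumed. This is not the attached patch. The class CodePointLoopSketch, the Step interface, and the run helper are hypothetical names, and the assumption that a step returns the number of code points it consumed (0 for none) is mine.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

final class CodePointLoopSketch {
    // Hypothetical stand-in for a per-position translator.
    // Assumed contract: returns the number of code points consumed, or 0 if none.
    interface Step {
        int apply(CharSequence input, int index, Writer out) throws IOException;
    }

    static void translateAll(CharSequence input, Writer out, Step step) throws IOException {
        final int len = input.length(); // length in chars, not code points
        int pos = 0;                    // current char index
        while (pos < len) {
            int consumed = step.apply(input, pos, out);
            if (consumed == 0) {
                // Nothing translated here: copy one whole code point through.
                char[] chars = Character.toChars(Character.codePointAt(input, pos));
                out.write(chars);
                pos += chars.length;
                continue;
            }
            // Skip the consumed code points; the bound check guards against
            // a step that reports more than remains in the input.
            for (int i = 0; i < consumed && pos < len; i++) {
                pos += Character.charCount(Character.codePointAt(input, pos));
            }
        }
    }

    // Convenience wrapper that collects the output in a String.
    static String run(CharSequence input, Step step) {
        StringWriter out = new StringWriter();
        try {
            translateAll(input, out, step);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A step that never translates anything: the input is copied unchanged.
        String copied = run("a\uD83D\uDE30b", (in, idx, w) -> 0);
        System.out.println(copied.equals("a\uD83D\uDE30b"));
    }
}
```

The bound check in the inner loop is a defensive choice: even if a step reports a char count where a code point count is expected, the position never moves past the end of the input.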

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
