commons-issues mailing list archives

From "Gary Gregory (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LANG-858) StringEscapeUtils.escapeJava() does not output the escaped surrogate pairs that is Java parsable
Date Thu, 22 Nov 2012 02:24:58 GMT

    [ https://issues.apache.org/jira/browse/LANG-858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502541#comment-13502541 ]

Gary Gregory commented on LANG-858:
-----------------------------------

I've added some more tests and @Ignore'd the failing ones. 

escapeJava() should behave correctly; we need to work out how to make that happen under the hood
without losing our current flexibility or making the whole escaping process Java-specific.
                
> StringEscapeUtils.escapeJava() does not output the escaped surrogate pairs that is Java parsable
> ------------------------------------------------------------------------------------------------
>
>                 Key: LANG-858
>                 URL: https://issues.apache.org/jira/browse/LANG-858
>             Project: Commons Lang
>          Issue Type: Bug
>          Components: lang.*, lang.text.translate.*
>    Affects Versions: 3.x
>            Reporter: Kazuki Hamasaki
>            Priority: Minor
>              Labels: escaping
>         Attachments: JavaUnicodeEscape.patch
>
>
> In Java and ECMAScript, a Unicode escape of the form {{'\uxxxxxx'}} (more than four hex digits) is not
> valid. A supplementary character must be split into a high-surrogate escape and a low-surrogate escape.
> For example, given the surrogate pair
> {code:java}
> '\uDBFF\uDFFD'
> {code}
> the output must be
> {code:java}
> "\\uDBFF\\uDFFD"
> {code}
> However, the actual output is
> {code:java}
> "\\u10FFFD"
> {code}
> Test case here:
> {code:java}
> @Test
> public void testEscapeSurrogatePairs() throws Exception {
>     assertEquals("\\uDBFF\\uDFFD", StringEscapeUtils.escapeJava("\uDBFF\uDFFD"));
>     assertEquals("\\uDBFF\\uDFFD", StringEscapeUtils.escapeEcmaScript("\uDBFF\uDFFD"));
> }
> {code}
> I have attached a patch that implements a simple solution.
> However, I don't think UnicodeEscaper.java should be made specific to Java. We need to discuss this.
> This issue does not occur when unescaping.
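
The expected behaviour described in the issue can be illustrated with a minimal standalone sketch. It is not the attached patch and not Commons Lang code: the class name SurrogateEscapeSketch and the method escapeNonAscii are hypothetical, and the rule "escape everything outside printable ASCII" is an assumption chosen only to keep the example short. The point it shows is that a supplementary code point is written as one escape per UTF-16 code unit rather than as a single six-digit escape.

```java
// Hypothetical sketch; names are illustrative and not part of Commons Lang.
class SurrogateEscapeSketch {

    // Escapes every character outside printable ASCII, emitting one
    // four-digit escape per UTF-16 code unit so that supplementary
    // characters come out as a high/low surrogate pair.
    static String escapeNonAscii(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); ) {
            int codePoint = input.codePointAt(i);
            if (codePoint >= 0x20 && codePoint <= 0x7E) {
                out.append((char) codePoint);
            } else {
                // Character.toChars returns one char for BMP code points
                // and two chars (a surrogate pair) for supplementary ones.
                for (char unit : Character.toChars(codePoint)) {
                    out.append(String.format("\\u%04X", (int) unit));
                }
            }
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // U+10FFFD, the code point used in the issue's test case.
        String input = new String(Character.toChars(0x10FFFD));
        System.out.println(escapeNonAscii(input));
    }
}
```

For U+10FFFD this yields the two escapes the test case above expects (DBFF followed by DFFD). Whether such splitting belongs in UnicodeEscaper itself or in a Java-specific subclass is the open design question raised in this thread.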

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
