commons-issues mailing list archives

From "Henri Yandell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LANG-710) StringIndexOutOfBoundsException when calling unescapeHtml4("&#03")
Date Thu, 07 Jul 2011 03:45:17 GMT

    [ https://issues.apache.org/jira/browse/LANG-710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061027#comment-13061027 ]

Henri Yandell commented on LANG-710:
------------------------------------

So the basic issue, imo, is that ParseException is a typed (checked) exception - we'd have to
introduce it to the StringEscapeUtils API.

I'm uncomfortable throwing a random IllegalArgumentException (or similar) when bad data
is passed in. That may be the typed-exception fan in me speaking. I don't like discovering
at 4am that someone found a piece of data that caused a heretofore unknown runtime exception
to occur.

So we have three options:

1: Leave the entity as it is (not unescaped) because it is malformed.
2: Claim that we're dealing with XHTML and throw an exception.
3: Unescape the data anyway.

All the options seem useful, but none of them seem perfect. So I've implemented all three.

svn ci -m "Making unescapeHtml _NOT_ escape unfinished numeric entities by default (it ignores
them); however adding options that will fire an exception or unescape the numeric entity.
LANG-710"
Sending        src/main/java/org/apache/commons/lang3/text/translate/NumericEntityUnescaper.java
Sending        src/test/java/org/apache/commons/lang3/text/translate/NumericEntityUnescaperTest.java
Transmitting file data ..
Committed revision 1143641.
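
For readers without the commit at hand, the three behaviours can be sketched
roughly as below. This is a minimal, self-contained illustration and not the
committed NumericEntityUnescaper code: the class name, the Mode enum and its
constants are hypothetical, and the choice of IllegalArgumentException for the
"throw" case is an assumption made for the sketch. The bounds check before each
charAt() call is the kind of guard whose absence would produce the
StringIndexOutOfBoundsException reported below.

```java
public class NumericEntitySketch {

    /** Hypothetical names for the three options discussed above. */
    enum Mode { LEAVE_AS_IS, THROW, UNESCAPE }

    /** Unescapes &#NN; and &#xHH; entities; 'mode' decides what happens when ';' is missing. */
    static String unescape(String input, Mode mode) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            int consumed = translate(input, i, mode, out);
            if (consumed == 0) {
                // Not an entity (or deliberately left alone): copy one char through.
                out.append(input.charAt(i));
                i++;
            } else {
                i += consumed;
            }
        }
        return out.toString();
    }

    /** Returns the number of chars consumed at 'index', or 0 if nothing was translated. */
    private static int translate(String input, int index, Mode mode, StringBuilder out) {
        int len = input.length();
        // Need "&#" plus at least one more character.
        if (index + 2 >= len || input.charAt(index) != '&' || input.charAt(index + 1) != '#') {
            return 0;
        }
        int start = index + 2;
        boolean hex = false;
        char first = input.charAt(start);
        if (first == 'x' || first == 'X') {
            hex = true;
            start++;
        }
        int radix = hex ? 16 : 10;
        int end = start;
        // Check the bound before every charAt(); input may end mid-entity.
        while (end < len && Character.digit(input.charAt(end), radix) >= 0) {
            end++;
        }
        if (end == start) {
            return 0; // "&#" or "&#x" with no digits: leave untouched
        }
        boolean hasSemicolon = end < len && input.charAt(end) == ';';
        if (!hasSemicolon) {
            if (mode == Mode.LEAVE_AS_IS) {
                return 0; // option 1: ignore the unfinished entity
            }
            if (mode == Mode.THROW) {
                // option 2: exception type is an assumption for this sketch
                throw new IllegalArgumentException(
                        "Semicolon required at end of numeric entity");
            }
            // option 3 (Mode.UNESCAPE): fall through and unescape anyway
        }
        int codePoint;
        try {
            codePoint = Integer.parseInt(input.substring(start, end), radix);
        } catch (NumberFormatException e) {
            return 0; // value too large to parse: leave untouched
        }
        if (!Character.isValidCodePoint(codePoint)) {
            return 0;
        }
        out.appendCodePoint(codePoint);
        return (end - index) + (hasSemicolon ? 1 : 0);
    }
}
```

With this sketch, unescape("&#03", Mode.LEAVE_AS_IS) hands the input back
unchanged, Mode.THROW raises the exception, and Mode.UNESCAPE yields the single
character U+0003. A well-formed entity such as "&#65;" is unescaped in all
three modes.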


> StringIndexOutOfBoundsException when calling unescapeHtml4("&#03")
> ------------------------------------------------------------------
>
>                 Key: LANG-710
>                 URL: https://issues.apache.org/jira/browse/LANG-710
>             Project: Commons Lang
>          Issue Type: Bug
>          Components: lang.*
>    Affects Versions: 3.0
>         Environment: java version "1.6.0_24"
> Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
> Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
>            Reporter: Benjamin Valentin
>            Assignee: Henri Yandell
>            Priority: Minor
>              Labels: StringEscapeUtils, StringUtils
>             Fix For: 3.0
>
>
> When calling unescapeHtml4() on the String "&#03" (or any String that contains these characters) an Exception is thrown:
> Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 4
> 	at java.lang.String.charAt(String.java:686)
> 	at org.apache.commons.lang3.text.translate.NumericEntityUnescaper.translate(NumericEntityUnescaper.java:49)
> 	at org.apache.commons.lang3.text.translate.AggregateTranslator.translate(AggregateTranslator.java:53)
> 	at org.apache.commons.lang3.text.translate.CharSequenceTranslator.translate(CharSequenceTranslator.java:88)
> 	at org.apache.commons.lang3.text.translate.CharSequenceTranslator.translate(CharSequenceTranslator.java:60)
> 	at org.apache.commons.lang3.StringEscapeUtils.unescapeHtml4(StringEscapeUtils.java:351)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
