commons-issues mailing list archives

From "Henri Yandell (JIRA)" <>
Subject [jira] [Commented] (LANG-710) StringIndexOutOfBoundsException when calling unescapeHtml4("&#03")
Date Thu, 07 Jul 2011 03:45:17 GMT


Henri Yandell commented on LANG-710:

So the basic issue imo is that ParseException is a typed (checked) exception, so we'd have
to introduce it to the StringEscapeUtils API.

I'm uncomfortable throwing a random IllegalArgumentException (or similar) when bad data
is passed in. That may be the typed-exception fan in me speaking. I don't like discovering
at 4am that someone found a piece of data that caused a heretofore unknown runtime exception
to occur.

So we have three options:

1: Leave the data unescaped because it is poorly formed.
2: Claim that we're dealing with XHTML and throw an exception.
3: Unescape the data anyway.

All the options seem useful, but none of them seem perfect. So I've implemented all three.
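
As a rough illustration of the three behaviours above, here is a minimal, self-contained sketch. It is not the committed code: the class name, the Mode enum and its constants are hypothetical names made up for this example, it handles only decimal entities with no range checking, and the IllegalArgumentException is a placeholder exception type chosen for the sketch.

```java
// Illustrative sketch only; not the Commons Lang implementation.
final class UnfinishedEntitySketch {

    /** Hypothetical names for the three options listed above. */
    enum Mode { LEAVE_AS_IS, THROW, UNESCAPE }

    /**
     * Unescapes decimal numeric entities such as "&#65;". The mode decides
     * what happens when the digits are not followed by a semicolon.
     */
    static String unescapeNumeric(String input, Mode mode) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            if (input.startsWith("&#", i)) {
                // Scan the digits, checking the bound before every charAt call.
                int end = i + 2;
                while (end < input.length()
                        && input.charAt(end) >= '0' && input.charAt(end) <= '9') {
                    end++;
                }
                boolean hasDigits = end > i + 2;
                boolean terminated = end < input.length() && input.charAt(end) == ';';

                if (hasDigits && (terminated || mode == Mode.UNESCAPE)) {
                    // Option 3 (or a well-formed entity): emit the character.
                    out.appendCodePoint(Integer.parseInt(input.substring(i + 2, end)));
                    i = terminated ? end + 1 : end;
                    continue;
                }
                if (hasDigits && mode == Mode.THROW) {
                    // Option 2: refuse the unfinished entity.
                    throw new IllegalArgumentException(
                            "Numeric entity without ';' at index " + i);
                }
                // Option 1: fall through and copy the text unchanged.
            }
            out.append(input.charAt(i));
            i++;
        }
        return out.toString();
    }
}
```

With this sketch, unescapeNumeric("&#03", Mode.LEAVE_AS_IS) hands back the input unchanged, Mode.UNESCAPE yields the single character U+0003, and Mode.THROW raises the placeholder exception. A properly terminated entity such as "&#65;" is unescaped in every mode.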

svn ci -m "Making unescapeHtml _NOT_ escape unfinished numeric entities by default (it ignores
them); however adding options that will fire an exception or unescape the numeric entity.
Sending        src/main/java/org/apache/commons/lang3/text/translate/
Sending        src/test/java/org/apache/commons/lang3/text/translate/
Transmitting file data ..
Committed revision 1143641.

> StringIndexOutOfBoundsException when calling unescapeHtml4("&#03")
> ------------------------------------------------------------------
>                 Key: LANG-710
>                 URL:
>             Project: Commons Lang
>          Issue Type: Bug
>          Components: lang.*
>    Affects Versions: 3.0
>         Environment: java version "1.6.0_24"
> Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
> Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
>            Reporter: Benjamin Valentin
>            Assignee: Henri Yandell
>            Priority: Minor
>              Labels: StringEscapeUtils, StringUtils
>             Fix For: 3.0
> When calling unescapeHtml4() on the String "&#03" (or any String that contains these characters) an Exception is thrown:
> Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 4
> 	at java.lang.String.charAt(
> 	at org.apache.commons.lang3.text.translate.NumericEntityUnescaper.translate(
> 	at org.apache.commons.lang3.text.translate.AggregateTranslator.translate(
> 	at org.apache.commons.lang3.text.translate.CharSequenceTranslator.translate(
> 	at org.apache.commons.lang3.text.translate.CharSequenceTranslator.translate(
> 	at org.apache.commons.lang3.StringEscapeUtils.unescapeHtml4(

This message is automatically generated by JIRA.