commons-issues mailing list archives

From "Henri Yandell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LANG-882) LookupTranslator accepts CharSequence as input, but fails to work with implementations other than String
Date Tue, 23 Apr 2013 06:01:21 GMT

    [ https://issues.apache.org/jira/browse/LANG-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638816#comment-13638816 ]

Henri Yandell commented on LANG-882:
------------------------------------

Test is easy - take the current LookupTranslator test and make a StringBuffer version. 

Solutions: naively throwing in a TreeMap doesn't work. A ClassCastException occurs between StringBuffer
and String. This is because calling subSequence on StringBuffer returns a String (boo!), and
the call to compareTo in TreeMap's getEntry cannot compare the two different types.
Presumably this could be solved with a custom comparator.
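
A custom comparator along those lines might look like the sketch below. It orders
keys by their characters only, so the concrete CharSequence type no longer matters.
The names CharSequenceComparatorSketch, BY_CONTENT and newLookupMap are made up for
illustration and are not part of Commons Lang.

```java
import java.nio.CharBuffer;
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical demo class; not part of Commons Lang.
class CharSequenceComparatorSketch {

    // Compares two CharSequences character by character, in the same order
    // String.compareTo would use, without assuming either one is a String.
    static final Comparator<CharSequence> BY_CONTENT = new Comparator<CharSequence>() {
        @Override
        public int compare(CharSequence a, CharSequence b) {
            int n = Math.min(a.length(), b.length());
            for (int i = 0; i < n; i++) {
                int diff = a.charAt(i) - b.charAt(i);
                if (diff != 0) {
                    return diff;
                }
            }
            return a.length() - b.length();
        }
    };

    // Builds a small map whose keys are deliberately StringBuffers.
    static TreeMap<CharSequence, CharSequence> newLookupMap() {
        TreeMap<CharSequence, CharSequence> map =
                new TreeMap<CharSequence, CharSequence>(BY_CONTENT);
        map.put(new StringBuffer("<"), "&lt;");
        map.put(new StringBuffer(">"), "&gt;");
        return map;
    }

    public static void main(String[] args) {
        TreeMap<CharSequence, CharSequence> map = newLookupMap();
        // A String key finds the entry stored under a StringBuffer.
        System.out.println(map.get("<"));
        // So does a key of yet another CharSequence type.
        System.out.println(map.get(CharBuffer.wrap(">")));
    }
}
```

Because the TreeMap is given the comparator up front, it never calls compareTo on
the keys themselves, which is where the mixed-type cast failed.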

Changing the key of the HashMap to be a String resolves the issue. It feels weird for key
matching to depend on the concrete type; i.e., if the key was StringBuffer("foo"), I'd expect it
to match the String "foo" as well. Only matching the type of the input seems odd. I can see value
in keeping the translate-to part of the system as CharSequence; you could have large items of text
that won't be read until they are actually needed.
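
A minimal sketch of that String-keyed approach follows. It is not the actual
LookupTranslator code: the class StringKeyedLookupSketch and its match and
htmlSample methods are hypothetical names used only to illustrate the idea.
Keys are normalized with toString() on the way in and on lookup, while the
translate-to values stay as CharSequence.

```java
import java.nio.CharBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; not the real LookupTranslator.
class StringKeyedLookupSketch {
    // String keys have well-defined equals/hashCode; values stay CharSequence.
    private final Map<String, CharSequence> lookupMap = new HashMap<String, CharSequence>();
    private int longest = 0;

    StringKeyedLookupSketch(CharSequence[]... pairs) {
        for (CharSequence[] pair : pairs) {
            String key = pair[0].toString();
            lookupMap.put(key, pair[1]);
            longest = Math.max(longest, key.length());
        }
    }

    // Returns the replacement for the longest key found at index, or null.
    CharSequence match(CharSequence input, int index) {
        int max = Math.min(longest, input.length() - index);
        for (int len = max; len >= 1; len--) {
            // toString() makes the lookup independent of the input's type.
            String candidate = input.subSequence(index, index + len).toString();
            CharSequence result = lookupMap.get(candidate);
            if (result != null) {
                return result;
            }
        }
        return null;
    }

    static StringKeyedLookupSketch htmlSample() {
        return new StringKeyedLookupSketch(
                new CharSequence[] {new StringBuffer("<"), "&lt;"},
                new CharSequence[] {"&", new StringBuilder("&amp;")});
    }

    public static void main(String[] args) {
        StringKeyedLookupSketch t = htmlSample();
        System.out.println(t.match("1 < 2", 2));
        System.out.println(t.match(CharBuffer.wrap("1 < 2"), 2));
    }
}
```

The cost is one short String per candidate lookup, but those are bounded by the
longest key rather than by the size of the input.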


                
> LookupTranslator accepts CharSequence as input, but fails to work with implementations other than String
> --------------------------------------------------------------------------------------------------------
>
>                 Key: LANG-882
>                 URL: https://issues.apache.org/jira/browse/LANG-882
>             Project: Commons Lang
>          Issue Type: Bug
>          Components: lang.text.translate.*
>    Affects Versions: 3.1
>            Reporter: Mark A. Ziesemer
>             Fix For: 3.2
>
>
> The core of {{org.apache.commons.lang3.text.translate}} is a
> {{HashMap<CharSequence, CharSequence> lookupMap}}.
> From the Javadoc of {{CharSequence}} (emphasis mine):
> {quote}
> This interface does not refine the general contracts of the equals and hashCode methods.
> The result of comparing two objects that implement CharSequence is therefore, in general,
> undefined. Each object may be implemented by a different class, and there is no guarantee
> that each class will be capable of testing its instances for equality with those of the other.
> *It is therefore inappropriate to use arbitrary CharSequence instances as elements in a set
> or as keys in a map.*
> {quote}
> The current implementation causes code such as the following to not work as expected:
> {code}
> CharSequence cs1 = "1 < 2";
> CharSequence cs2 = CharBuffer.wrap("1 < 2".toCharArray());
> System.out.println(StringEscapeUtils.ESCAPE_HTML4.translate(cs1));
> System.out.println(StringEscapeUtils.ESCAPE_HTML4.translate(cs2));
> {code}
> ... which gives the following results (but should be identical):
> {noformat}
> 1 &lt; 2
> 1 < 2
> {noformat}
> The problem, at a minimum, is that the Javadoc for {{CharBuffer.equals}} explicitly states:
> {quote}
> A char buffer is not equal to any other type of object.
> {quote}
> ... so a lookup on a CharBuffer in the Map will always fail when compared against the
> String implementations that it contains.
> An obvious work-around is to instead use something along the lines of either of the following:
> {code}
> System.out.println(StringEscapeUtils.ESCAPE_HTML4.translate(cs2.toString()));
> System.out.println(StringEscapeUtils.escapeHtml4(cs2.toString()));
> {code}
> ... which forces everything back to a {{String}}.  However, this is not practical when
> working with large sets of data, where it would incur significant heap allocation and garbage
> collection overhead.  (As such, I was actually trying to use the {{translate}} method that
> outputs to a {{Writer}} - but simplified the above examples to omit this.)
> Another option that I'm considering is to use a custom {{CharSequence}} wrapper around
> a {{char[]}} that implements {{hashCode()}} and {{equals()}} to be compatible with those implemented
> on {{String}}.  (However, this will be interesting due to the symmetry requirement of {{equals}} - and
> it is further interesting that {{String.equals}} is currently implemented using {{instanceof}}
> - even though {{String}} is {{final}}...)
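
The wrapper idea in the last quoted paragraph, and the symmetry problem it runs
into, can be sketched roughly as follows. CharArraySequence is a hypothetical
class written for this illustration; it is not from Commons Lang or from the
reporter's code. Its hashCode uses the formula documented for String.hashCode,
and its equals accepts any CharSequence with the same characters.

```java
import java.util.Arrays;

// Hypothetical wrapper, for illustration only.
final class CharArraySequence implements CharSequence {
    private final char[] chars;

    CharArraySequence(char[] chars) {
        this.chars = chars.clone();
    }

    @Override
    public int length() {
        return chars.length;
    }

    @Override
    public char charAt(int index) {
        return chars[index];
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return new CharArraySequence(Arrays.copyOfRange(chars, start, end));
    }

    @Override
    public String toString() {
        return new String(chars);
    }

    // Same formula as documented for String.hashCode: s[0]*31^(n-1) + ... + s[n-1].
    @Override
    public int hashCode() {
        int h = 0;
        for (char c : chars) {
            h = 31 * h + c;
        }
        return h;
    }

    // Equal to any CharSequence holding the same characters.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof CharSequence)) {
            return false;
        }
        CharSequence other = (CharSequence) o;
        if (other.length() != chars.length) {
            return false;
        }
        for (int i = 0; i < chars.length; i++) {
            if (other.charAt(i) != chars[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String s = "1 < 2";
        CharArraySequence w = new CharArraySequence(s.toCharArray());
        System.out.println(w.hashCode() == s.hashCode()); // true
        System.out.println(w.equals(s));                  // true
        System.out.println(s.equals(w));                  // false: String only equals String
    }
}
```

The last line is the asymmetry: the wrapper can agree to equal a String, but
String.equals will never agree to equal the wrapper, so the equals contract is
violated whichever way the wrapper is written.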

