Date: Mon, 22 Apr 2013 04:27:15 +0000 (UTC)
From: "Henri Yandell (JIRA)"
To: issues@commons.apache.org
Reply-To: issues@commons.apache.org
Subject: [jira] [Commented] (LANG-882) LookupTranslator accepts CharSequence as input, but fails to work with implementations other than String

    [ https://issues.apache.org/jira/browse/LANG-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637764#comment-13637764 ]

Henri Yandell commented on LANG-882:
------------------------------------

I've added a note about this to the Javadoc for the moment.

The options that jump out at me are:

* call .toString() on the key and accept any performance hit, or
* perhaps use a TreeMap with a custom CharSequenceComparator when CharSequences are passed in that don't have refined equals(Object)/hashCode() methods, and accept a slower lookup.

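To make the second option concrete, here is a minimal sketch of a content-based comparator backing a TreeMap. It is only an illustration under stated assumptions: the class names CharSequenceComparator and TreeMapLookupDemo and the two table entries are hypothetical, and nothing here is taken from the actual LookupTranslator code.

```java
import java.nio.CharBuffer;
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

final class TreeMapLookupDemo {
    /** Builds a content-keyed lookup table; the two entries are illustrative only. */
    static Map<CharSequence, CharSequence> newLookup() {
        Map<CharSequence, CharSequence> lookup = new TreeMap<>(new CharSequenceComparator());
        lookup.put("<", "&lt;");
        lookup.put(">", "&gt;");
        return lookup;
    }

    public static void main(String[] args) {
        Map<CharSequence, CharSequence> lookup = newLookup();
        CharSequence key = CharBuffer.wrap("<".toCharArray());
        // TreeMap consults only the comparator, so the key's own equals()/hashCode()
        // are never called and a CharBuffer key can match a String entry.
        System.out.println(lookup.get(key));
    }
}

/** Hypothetical comparator: orders CharSequences by content, char by char. */
final class CharSequenceComparator implements Comparator<CharSequence> {
    @Override
    public int compare(CharSequence a, CharSequence b) {
        int common = Math.min(a.length(), b.length());
        for (int i = 0; i < common; i++) {
            int diff = a.charAt(i) - b.charAt(i);
            if (diff != 0) {
                return diff;
            }
        }
        // All shared positions are equal: the shorter sequence sorts first.
        return a.length() - b.length();
    }
}
```

The trade-off is the one mentioned above: each lookup costs O(log n) comparisons instead of a single hash probe, but no String is allocated per key.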
> LookupTranslator accepts CharSequence as input, but fails to work with implementations other than String
> --------------------------------------------------------------------------------------------------------
>
>                 Key: LANG-882
>                 URL: https://issues.apache.org/jira/browse/LANG-882
>             Project: Commons Lang
>          Issue Type: Bug
>          Components: lang.text.translate.*
>    Affects Versions: 3.1
>            Reporter: Mark A. Ziesemer
>             Fix For: 3.2
>
>
> The core of {{org.apache.commons.lang3.text.translate}} is a {{HashMap<CharSequence, CharSequence> lookupMap}}.
> From the Javadoc of {{CharSequence}} (emphasis mine):
> {quote}
> This interface does not refine the general contracts of the equals and hashCode methods. The result of comparing two objects that implement CharSequence is therefore, in general, undefined. Each object may be implemented by a different class, and there is no guarantee that each class will be capable of testing its instances for equality with those of the other. *It is therefore inappropriate to use arbitrary CharSequence instances as elements in a set or as keys in a map.*
> {quote}
> The current implementation causes code such as the following to not work as expected:
> {code}
> CharSequence cs1 = "1 < 2";
> CharSequence cs2 = CharBuffer.wrap("1 < 2".toCharArray());
> System.out.println(StringEscapeUtils.ESCAPE_HTML4.translate(cs1));
> System.out.println(StringEscapeUtils.ESCAPE_HTML4.translate(cs2));
> {code}
> ... which gives the following results (but should be identical):
> {noformat}
> 1 &lt; 2
> 1 < 2
> {noformat}
> The problem, at a minimum, is that the Javadoc for {{CharBuffer.equals}} even documents that:
> {quote}
> A char buffer is not equal to any other type of object.
> {quote}
> ... so a lookup on a CharBuffer in the Map will always fail when compared against the String keys that it contains.
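The failure described above can be reproduced without Commons Lang. The following is a minimal sketch in which a plain HashMap stands in for the translator's lookup table; the class name CharSequenceKeyDemo and the single entry are assumptions made for illustration.

```java
import java.nio.CharBuffer;
import java.util.HashMap;
import java.util.Map;

final class CharSequenceKeyDemo {
    /** Stand-in for the translator's lookup table; the single entry is illustrative. */
    static Map<CharSequence, CharSequence> newLookup() {
        Map<CharSequence, CharSequence> lookup = new HashMap<>();
        lookup.put("<", "&lt;");
        return lookup;
    }

    public static void main(String[] args) {
        Map<CharSequence, CharSequence> lookup = newLookup();
        CharSequence asString = "<";
        CharSequence asBuffer = CharBuffer.wrap("<".toCharArray());

        // String key against a String entry: found.
        System.out.println(lookup.get(asString));            // &lt;
        // CharBuffer key: CharBuffer.equals(String) is always false, so no match.
        System.out.println(lookup.get(asBuffer));            // null
        // Converting to String first restores the match, at the cost of a copy.
        System.out.println(lookup.get(asBuffer.toString())); // &lt;
    }
}
```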
> An obvious work-around is to instead use something along the lines of either of the following:
> {code}
> System.out.println(StringEscapeUtils.ESCAPE_HTML4.translate(cs2.toString()));
> System.out.println(StringEscapeUtils.escapeHtml4(cs2.toString()));
> {code}
> ... which forces everything back to a {{String}}. However, this is not practical when working with large sets of data, where it would incur significant heap allocation and garbage-collection overhead. (As such, I was actually trying to use the {{translate}} method that outputs to a {{Writer}} - but simplified the above examples to omit this.)
> Another option that I'm considering is to use a custom {{CharSequence}} wrapper around a {{char[]}} that implements {{hashCode()}} and {{equals()}} to work with those implemented on {{String}}. (However, this will be interesting due to the symmetry requirement of {{equals()}}, made more interesting by the fact that {{String.equals}} is currently implemented using {{instanceof}}, even though {{String}} is {{final}}...)
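The wrapper idea from the issue description could look roughly like the sketch below. The class name CharArraySequence is hypothetical, and the sketch deliberately sidesteps the symmetry problem rather than solving it: equals() only recognizes other wrappers, so the map's keys would have to be stored as wrappers too. The hash formula is the one documented for String.hashCode(), so hash values agree with String even though equality does not.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical read-only CharSequence view over a char[] with content-based equals/hashCode. */
final class CharArraySequence implements CharSequence {
    private final char[] chars;
    private final int offset;
    private final int length;

    CharArraySequence(char[] chars) {
        this(chars, 0, chars.length);
    }

    CharArraySequence(char[] chars, int offset, int length) {
        if (offset < 0 || length < 0 || offset > chars.length - length) {
            throw new IndexOutOfBoundsException();
        }
        this.chars = chars;
        this.offset = offset;
        this.length = length;
    }

    @Override public int length() { return length; }

    @Override public char charAt(int index) {
        if (index < 0 || index >= length) throw new IndexOutOfBoundsException();
        return chars[offset + index];
    }

    @Override public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > length || start > end) throw new IndexOutOfBoundsException();
        return new CharArraySequence(chars, offset + start, end - start); // shares the array
    }

    @Override public String toString() { return new String(chars, offset, length); }

    @Override public int hashCode() {
        // Formula documented for String.hashCode(): s[0]*31^(n-1) + ... + s[n-1].
        int h = 0;
        for (int i = 0; i < length; i++) h = 31 * h + chars[offset + i];
        return h;
    }

    @Override public boolean equals(Object other) {
        if (this == other) return true;
        // Only wrappers are accepted: "abc".equals(wrapper) is always false,
        // so accepting String here would break the symmetry of equals().
        if (!(other instanceof CharArraySequence)) return false;
        CharArraySequence that = (CharArraySequence) other;
        if (length != that.length) return false;
        for (int i = 0; i < length; i++) {
            if (chars[offset + i] != that.chars[that.offset + i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Map<CharSequence, CharSequence> lookup = new HashMap<>();
        lookup.put(new CharArraySequence("<".toCharArray()), "&lt;");

        char[] input = "1 < 2".toCharArray();
        CharSequence key = new CharArraySequence(input, 2, 1); // view of "<", no copy

        System.out.println(lookup.get(key));                   // &lt;
        System.out.println(key.hashCode() == "<".hashCode());  // true
        System.out.println("<".equals(key));                   // false
    }
}
```

Because lookup keys are views into the caller's array, scanning a large input allocates only small wrapper objects and never copies the characters.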