Message-ID: <18497423.7431295243503865.JavaMail.jira@thor>
Date: Mon, 17 Jan 2011 00:51:43 -0500 (EST)
From: "Henri Yandell (JIRA)"
To: issues@commons.apache.org
Reply-To: issues@commons.apache.org
Subject: [jira] Updated: (LANG-607) StringUtils methods do not handle Unicode 2.0+ supplementary characters correctly.

     [ https://issues.apache.org/jira/browse/LANG-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Henri Yandell updated LANG-607:
-------------------------------

Moving to 3.1, as this is not a backwards-incompatible change.

> StringUtils methods do not handle Unicode 2.0+ supplementary characters correctly.
> ----------------------------------------------------------------------------------
>
>                 Key: LANG-607
>                 URL: https://issues.apache.org/jira/browse/LANG-607
>             Project: Commons Lang
>          Issue Type: Bug
>          Components: lang.*
>    Affects Versions: 2.5
>         Environment: java version "1.6.0_16"
> Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
> Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)
> Microsoft Windows [Version 6.0.6002]
> Apache Maven 2.2.1 (r801777; 2009-08-06 12:16:01-0700)
> Java version: 1.6.0_16
> Java home: C:\Program Files\Java\jdk1.6.0_16\jre
> Default locale: en_US, platform encoding: Cp1252
> OS name: "windows vista" version: "6.0" arch: "amd64" Family: "windows"
>            Reporter: Gary Gregory
>            Assignee: Gary Gregory
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: LANG-607.diff
>
>
> The StringUtils.containsAny methods incorrectly match Unicode 2.0+ supplementary characters.
> For example, define a test fixture to be the Unicode character U+20000, where U+20000 is written in Java source as "\uD840\uDC00":
>     private static final String CharU20000 = "\uD840\uDC00";
>     private static final String CharU20001 = "\uD840\uDC01";
> You can see Unicode supplementary characters correctly implemented in the JRE call:
>     assertEquals(-1, CharU20000.indexOf(CharU20001));
> But this is broken:
>     assertEquals(false, StringUtils.containsAny(CharU20000, CharU20001));
>     assertEquals(false, StringUtils.containsAny(CharU20001, CharU20000));
> This is fine:
>     assertEquals(true, StringUtils.contains(CharU20000 + CharU20001, CharU20000));
>     assertEquals(true, StringUtils.contains(CharU20000 + CharU20001, CharU20001));
>     assertEquals(true, StringUtils.contains(CharU20000, CharU20000));
>     assertEquals(false, StringUtils.contains(CharU20000, CharU20001));
> because the method calls the JRE to perform the match.
> More than you want to know:
> - http://java.sun.com/developer/technicalArticles/Intl/Supplementary/
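
The false positive reported for containsAny can be reproduced without Commons Lang. The sketch below is an illustration only, not the Commons Lang source or the attached LANG-607.diff: the class SupplementaryContainsAny and both method names are hypothetical. It assumes the broken behavior comes from comparing one 16-bit char at a time, so that the shared high surrogate \uD840 counts as a match, and contrasts that with a comparison over whole code points using the JRE's String.codePointAt and Character.charCount.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; names are illustrative, not part of Commons Lang.
final class SupplementaryContainsAny {

    // Char-by-char matching: each 16-bit unit of str is looked up on its own,
    // so one half of a surrogate pair is enough to report a match.
    static boolean containsAnyByChar(String str, String searchChars) {
        if (str == null || searchChars == null) {
            return false;
        }
        for (int i = 0; i < str.length(); i++) {
            if (searchChars.indexOf(str.charAt(i)) >= 0) {
                return true;
            }
        }
        return false;
    }

    // Code-point matching: surrogate pairs are decoded before comparison.
    static boolean containsAnyByCodePoint(String str, String searchChars) {
        if (str == null || searchChars == null) {
            return false;
        }
        // Collect every code point of searchChars.
        Set<Integer> wanted = new HashSet<Integer>();
        for (int i = 0; i < searchChars.length(); ) {
            int cp = searchChars.codePointAt(i);
            wanted.add(cp);
            i += Character.charCount(cp); // advance 1 or 2 chars
        }
        // Walk str one code point at a time and look each one up.
        for (int i = 0; i < str.length(); ) {
            int cp = str.codePointAt(i);
            if (wanted.contains(cp)) {
                return true;
            }
            i += Character.charCount(cp);
        }
        return false;
    }

    public static void main(String[] args) {
        String charU20000 = "\uD840\uDC00";
        String charU20001 = "\uD840\uDC01";
        // The two strings share only their high surrogate.
        System.out.println(containsAnyByChar(charU20000, charU20001));      // true (false positive)
        System.out.println(containsAnyByCodePoint(charU20000, charU20001)); // false
        System.out.println(containsAnyByCodePoint(charU20000 + charU20001, charU20001)); // true
    }
}
```

Decoding to code points costs a second pass over searchChars, but it treats U+20000 and U+20001 as distinct characters, which is the result the test cases in the report expect.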