From: Robert Muir
Date: Fri, 24 Sep 2010 01:46:24 -0400
Subject: Re: [classlib][luni] String.toLowerCase/toUpperCase incorrect for supplementary characters (HARMONY-6649)
To: dev@harmony.apache.org

On Fri, Sep 24, 2010 at 1:13 AM, Robert Muir wrote:

> But case-insensitive filenames (such as on Windows) don't use
> locale-dependent comparisons?
> They implement locale-independent case folding. For example, if I have a
> file "σ.txt", I cannot create "ς.txt". (I just tried.)
> Both of these filenames are already in lowercase.
>
> The interesting question is: how does hashCode() relate? Because a hashcode
> based upon String.toLowerCase(Locale.ENGLISH) would return different
> hashcodes for these two filenames, but with UCharacter.foldCase(), it would
> be the same.
>

I did more tests here against Windows, and looked at the API link you provided:
http://download.oracle.com/javase/6/docs/api/java/io/File.html#hashCode%28%29

Windows appears to implement something closer to 'simple case folding':
http://icu-project.org/apiref/icu4j/com/ibm/icu/lang/UCharacter.html#foldCase(int, boolean)
rather than 'full case folding'.
For example, I can create both "ss.txt" and "ß.txt", which full case folding would treat as the same name.

Windows also doesn't lowercase any supplementary characters. In other words, its case comparison acts exactly like String.equalsIgnoreCase(). So, for File.hashCode(), a hashcode computed with Character.toLowerCase(Character.toUpperCase(char)) would be completely consistent with its case insensitivity.

-- 
Robert Muir
rcmuir@gmail.com
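To make the hashcode suggestion concrete, here is a rough sketch of what a per-char folded hash could look like. The class name FoldedHash and the helpers fold() and foldedHashCode() are made up for illustration and are not Harmony code; the 31-based polynomial is simply borrowed from String.hashCode() as an assumption. It only uses Character.toLowerCase(char), Character.toUpperCase(char), String.equalsIgnoreCase() and String.toLowerCase(Locale) from the JDK.

```java
import java.util.Locale;

public class FoldedHash {
    // Fold a single UTF-16 code unit: uppercase first, then lowercase.
    // Surrogate code units are returned unchanged, so supplementary
    // characters are never case-mapped by this helper.
    public static char fold(char c) {
        return Character.toLowerCase(Character.toUpperCase(c));
    }

    // Hypothetical helper: the String.hashCode() polynomial, but computed
    // over folded code units instead of the raw ones.
    public static int foldedHashCode(String name) {
        int h = 0;
        for (int i = 0; i < name.length(); i++) {
            h = 31 * h + fold(name.charAt(i));
        }
        return h;
    }

    public static void main(String[] args) {
        String sigma = "\u03C3.txt";      // small sigma
        String finalSigma = "\u03C2.txt"; // final small sigma
        String sharpS = "\u00DF.txt";     // sharp s

        // Both sigmas uppercase to the same capital letter, so they compare equal
        // and the folded hashes agree.
        System.out.println(sigma.equalsIgnoreCase(finalSigma));
        System.out.println(foldedHashCode(sigma) == foldedHashCode(finalSigma));

        // A hash over toLowerCase(Locale.ENGLISH) keeps the two names apart.
        System.out.println(sigma.toLowerCase(Locale.ENGLISH).hashCode()
                == finalSigma.toLowerCase(Locale.ENGLISH).hashCode());

        // No full case folding: "ss.txt" and the sharp-s name stay distinct.
        System.out.println("ss.txt".equalsIgnoreCase(sharpS));
    }
}
```

The point of going through toUpperCase() first is that σ and ς only meet at the capital sigma; lowercasing alone leaves them different, which is the mismatch described above.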