From: Tim Ellison
Date: Mon, 20 Feb 2006 16:51:36 +0000
To: harmony-dev@incubator.apache.org
Subject: Re: [jira] Commented: (HARMONY-68) java.nio.charset.Charset.isSupported(String charsetName) does not throw IllegalCharsetNameException for spoiled standard sharset name
Message-ID: <43F9F398.20701@gmail.com>
In-Reply-To: <43F73ECC.2070307@gmail.com>

Thanks, I had initially read over the additional restriction on the
first character.

This strikes me as one of those cases where the reference impl. wins
over the specification. I think Svetlana's test was written to the spec.

If we discover an app that relies upon isSupported throwing an
IllegalCharsetNameException instead of returning false then (besides
wondering where this app has ever run) we can revisit.

I vote we resolve this part of the bug as "won't fix".

Regards,
Tim

karan malhi wrote:
> Here is text from the j2se1.4.2 spec
>
> A charset name must begin with either a letter or a digit. The empty
> string is not a legal charset name. Charset names are not
> case-sensitive; that is, case is always ignored when comparing charset
> names. Charset names generally follow the conventions documented in /RFC
> 2278: IANA Charset Registration Procedures/.
>
> According to RFC - 2278
>
>    Finally, charsets being registered for use with the "text" media type
>    MUST have a primary name that conforms to the more restrictive syntax
>    of the charset field in MIME encoded-words [RFC-2047, RFC-2184] and
>    MIME extended parameter values [RFC-2184]. A combined ABNF definition
>    for such names is as follows:
>
>    mime-charset = 1*<Any CHAR except SPACE, CTLs, and cspecials>
>
>    cspecials    = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "
>                   <"> / "/" / "[" / "]" / "?" / "." / "=" / "*"
>
>    CHAR         = <any ASCII character>        ; (  0-177,  0.-127.)
>    SPACE        = <ASCII SP, space>            ; (     40,      32.)
>    CTL          = <any ASCII control           ; (  0- 37,  0.- 31.)
>                   character and DEL>          ; (    177,     127.)
>
> If I have interpreted the above correctly, then it basically means that
> the name can start with any ASCII character except ASCII (octal) 40,
> 0-37, 177. A "-" is 055 and an "_" is 137, neither of which falls under
> the above exclude list.
> So primarily if I have a charset named "-UTF-8" or "_UTF-8", it is not
> an illegal name.
>
> So looks like the spec definition is further tightening the Charsets
> accepted by java in that the name can only start with a letter or a
> digit. How do we interpret *must* ?
>
>
>
> So
>
> Richard Liang wrote:
>
>> Hello Tim,
>>
>> I'm wondering why I did not just copy the first sentence. :-)
>>
>> "A charset name **must** begin with either a letter or a digit." Does
>> this mean that a charset name which begins with neither a letter nor a
>> digit should be regarded as an illegal charset name?
>>
>>
>> Richard Liang
>> China Software Development Lab, IBM
>>
>>
>>
>> Tim Ellison wrote:
>>
>>> Richard Liang wrote:
>>>
>>>
>>>> Hello Tim,
>>>>
>>>> I think this is caused by different understanding of the java spec:
>>>>
>>>> A charset name **must** begin with either a letter or a digit. The
>>>> empty
>>>> string is not a legal charset name....
>>>>
>>>> What do you think the implication of "must" is here? :-)
>>>>
>>>
>>>
>>> But the name isn't empty, it is "-UTF-8" ? I must be missing
>>> something...
>>>
>>> Regards,
>>> Tim
>>>
>>>
>>>
>>>
>>>> Tim Ellison (JIRA) wrote:
>>>>
>>>>> [
>>>>> http://issues.apache.org/jira/browse/HARMONY-68?page=comments#action_12366784
>>>>>
>>>>> ]
>>>>> Tim Ellison commented on HARMONY-68:
>>>>> ------------------------------------
>>>>>
>>>>> The test looks invalid to me. You should only expect a
>>>>> java.nio.charset.IllegalCharsetNameException if the name itself
>>>>> contains disallowed characters, and both underscore and dash are
>>>>> permitted.
>>>>>
>>>>> The code Charset.isSupported("-UTF-8")
>>>>>
>>>>> should return false, not throw an exception.
>>>>>
>>>>>
>>>>>
>>>>>> java.nio.charset.Charset.isSupported(String charsetName) does not
>>>>>> throw IllegalCharsetNameException for spoiled standard sharset name
>>>>>> --------------------------------------------------------------------
>>>>>>
>>>>>>          Key: HARMONY-68
>>>>>>          URL: http://issues.apache.org/jira/browse/HARMONY-68
>>>>>>      Project: Harmony
>>>>>>         Type: Bug
>>>>>>   Components: Classlib
>>>>>>     Reporter: Svetlana Samoilenko
>>>>>>  Attachments: charset_patch.txt
>>>>>>
>>>>>> According to j2se 1.4.2 specification for Charset.isSupported(String
>>>>>> charsetName) the method must throw IllegalCharsetNameException "if
>>>>>> the given charset name is illegal". "Legal charset name must begin
>>>>>> with either a letter or a digit." The test listed below shows that
>>>>>> no exception is thrown if a "-" or "_" symbol is inserted before a
>>>>>> standard charset name, for example "-UTF-8" or "_US-ASCII".
>>>>>> Moreover the method returns "true" in this case.
>>>>>> BEA also does not throw the exception but returns "false".
>>>>>> Code to reproduce:
>>>>>>
>>>>>> import java.nio.charset.*;
>>>>>>
>>>>>> public class test2 {
>>>>>>     public static void main (String[] args) {
>>>>>>         // string starts neither a letter nor a digit
>>>>>>         boolean sup=false;
>>>>>>         try{
>>>>>>             sup=Charset.isSupported("-UTF-8");
>>>>>>             System.out.println("***BAD. should be exception; sup="+sup);
>>>>>>             sup=Charset.isSupported("_US-ASCII");
>>>>>>             System.out.println("***BAD. should be exception; sup="+sup);
>>>>>>         } catch (IllegalCharsetNameException e) {
>>>>>>             System.out.println("***OK. Expected IllegalCharsetNameException " + e);
>>>>>>         }
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>> Steps to Reproduce:
>>>>>> 1. Build Harmony (check-out on 2006-01-30) j2se subset as
>>>>>> described in README.txt.
>>>>>> 2. Compile test2.java using BEA 1.4 javac
>>>>>>> javac -d . test2.java
>>>>>>
>>>>>> 3. Run java using compatible VM (J9)
>>>>>>> java -showversion test2
>>>>>>
>>>>>> Output:
>>>>>> C:\tmp>C:\jrockit-j2sdk1.4.2_04\bin\java.exe -showversion test2
>>>>>> java version "1.4.2_04"
>>>>>> Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_04-b05)
>>>>>> BEA WebLogic JRockit(TM) 1.4.2_04 JVM (build ari-31788-20040616-1132-win-ia32, Native Threads, GC strategy: parallel)
>>>>>> ***BAD. should be exception; sup=false
>>>>>> ***BAD. should be exception; sup=false
>>>>>>
>>>>>> C:\tmp>C:\harmony\trunk\deploy\jre\bin\java -showversion test2
>>>>>> (c) Copyright 1991, 2005 The Apache Software Foundation or its licensors, as applicable.
>>>>>> ***BAD. should be exception; sup=true
>>>>>> ***BAD. should be exception; sup=true
>>>>>>
>>>>>> Suggested junit test case:
>>>>>> ------------------------ CharsetTest.java -------------------------------------------------
>>>>>> import java.nio.charset.*;
>>>>>> import junit.framework.*;
>>>>>>
>>>>>> public class CharsetTest extends TestCase {
>>>>>>     public static void main(String[] args) {
>>>>>>         junit.textui.TestRunner.run(CharsetTest.class);
>>>>>>     }
>>>>>>
>>>>>>     public void test_isSupported() {
>>>>>>         boolean sup=false;
>>>>>>         // string starts neither a letter nor a digit
>>>>>>         try{
>>>>>>             sup=Charset.isSupported("-UTF-8");
>>>>>>             fail("***BAD. should be exception IllegalCharsetNameException");
>>>>>>         } catch (IllegalCharsetNameException e) {
>>>>>>             //expected
>>>>>>         }
>>>>>>         // string starts neither a letter nor a digit
>>>>>>         try{
>>>>>>             sup=Charset.isSupported("_US-ASCII");
>>>>>>             fail("***BAD. should be exception IllegalCharsetNameException");
>>>>>>         } catch (IllegalCharsetNameException e) {
>>>>>>             //expected
>>>>>>         }
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>

-- 
Tim Ellison (t.p.ellison@gmail.com)
IBM Java technology centre, UK.
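
The thread turns on two separate questions: whether a name satisfies the
first-character rule quoted from the j2se 1.4.2 spec, and what
Charset.isSupported actually does with that name at run time. The following
is only a minimal sketch of how the two could be compared side by side. The
class CharsetNameCheck and its methods are hypothetical names made up for
illustration; they are not part of Harmony or the JDK, and the helper checks
nothing beyond the first character.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper for illustration only; not part of Harmony or the JDK.
final class CharsetNameCheck {

    // First-character rule as quoted in this thread: a charset name must
    // begin with either a letter or a digit, and must not be empty.
    static boolean startsWithLetterOrDigit(String name) {
        if (name.isEmpty()) {
            return false;
        }
        char c = name.charAt(0);
        return (c >= 'A' && c <= 'Z')
            || (c >= 'a' && c <= 'z')
            || (c >= '0' && c <= '9');
    }

    // Reports what the running class library does for the given name,
    // without assuming either outcome.
    static String probe(String name) {
        try {
            return "returned " + Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            return "threw IllegalCharsetNameException";
        }
    }

    public static void main(String[] args) {
        String[] names = {"-UTF-8", "_US-ASCII", "UTF-8"};
        for (String name : names) {
            System.out.println(name
                + ": first-char rule satisfied=" + startsWithLetterOrDigit(name)
                + ", isSupported " + probe(name));
        }
    }
}
```

Running main on different VMs prints, per name, whether the spec's
first-character rule holds and whether isSupported returned a value or threw,
which is the comparison the bug report makes by hand.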