harmony-dev mailing list archives

From karan malhi <karan.ma...@gmail.com>
Subject Re: [jira] Commented: (HARMONY-68) java.nio.charset.Charset.isSupported(String charsetName) does not throw IllegalCharsetNameException for spoiled standard sharset name
Date Sat, 18 Feb 2006 15:35:40 GMT
Here is the text from the J2SE 1.4.2 spec:
A charset name must begin with either a letter or a digit. The empty 
string is not a legal charset name. Charset names are not 
case-sensitive; that is, case is always ignored when comparing charset 
names. Charset names generally follow the conventions documented in 
/RFC 2278: IANA Charset Registration Procedures/ 
<http://ietf.org/rfc/rfc2278.txt>.
According to RFC 2278:

   Finally, charsets being registered for use with the "text" media type
   MUST have a primary name that conforms to the more restrictive syntax
   of the charset field in MIME encoded-words [RFC-2047, RFC-2184] and
   MIME extended parameter values [RFC-2184]. A combined ABNF definition
   for such names is as follows:

   mime-charset = 1*<Any CHAR except SPACE, CTLs, and cspecials>

   cspecials    = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "
                  <"> / "/" / "[" / "]" / "?" / "." / "=" / "*"

   CHAR         =  <any ASCII character>        ; (  0-177,  0.-127.)
   SPACE        =  <ASCII SP, space>            ; (     40,      32.)
   CTL          =  <any ASCII control           ; (  0- 37,  0.- 31.)
                    character and DEL>          ; (    177,     127.)

If I have interpreted the above correctly, it means that a name can 
contain (and therefore start with) any ASCII character except SPACE 
(octal 40), the control characters (octal 0-37 and 177), and the 
cspecials listed above. 
A "-" is octal 055 and a "_" is octal 137, and neither falls in the 
excluded set.
So under the RFC, a charset named "-UTF-8" or "_UTF-8" is not an 
illegal name.
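
To make that reading concrete, here is a rough sketch of the mime-charset 
rule as quoted above. This is only my illustration, not Harmony's or 
Sun's code: the class name, method name and CSPECIALS constant are made 
up, and the cspecials set is taken exactly as it appears in the quote.

```java
class MimeCharsetCheck {
    // cspecials as listed in the ABNF quoted above (hypothetical constant).
    private static final String CSPECIALS = "()<>@,;:\"/[]?.=*";

    // Returns true if every character of name is an ASCII CHAR that is
    // not SPACE, not a CTL (0-31 or 127) and not one of the cspecials.
    static boolean isMimeCharset(String name) {
        if (name.length() == 0) {
            return false; // 1*<...> requires at least one character
        }
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            // c <= 32 covers the CTLs and SPACE; c >= 127 covers DEL and non-ASCII
            if (c <= 32 || c >= 127 || CSPECIALS.indexOf(c) >= 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isMimeCharset("-UTF-8"));    // true
        System.out.println(isMimeCharset("_US-ASCII")); // true
        System.out.println(isMimeCharset("UTF 8"));     // false
    }
}
```

Note that this rule applies to every character of the name and says 
nothing special about the first one.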

So it looks like the Java spec tightens the RFC rule further: a charset 
name accepted by Java can only start with a letter or a digit. How do we 
interpret *must*?

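
For what it is worth, the two readings of *must* being debated in this 
thread could be sketched as below. Everything here is hypothetical 
(class name, method names, and the KNOWN set standing in for the 
installed charsets); it only shows how the two interpretations differ 
for a name like "-UTF-8".

```java
import java.nio.charset.IllegalCharsetNameException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class MustReadings {
    // Stand-in for the set of installed charsets (hypothetical).
    private static final Set<String> KNOWN =
            new HashSet<String>(Arrays.asList("utf-8", "us-ascii"));

    static boolean startsWithLetterOrDigit(String name) {
        if (name.length() == 0) {
            return false;
        }
        char c = name.charAt(0);
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                || (c >= '0' && c <= '9');
    }

    // Strict reading: a name that does not start with a letter or digit
    // is an illegal name, so the lookup throws.
    static boolean isSupportedStrict(String name) {
        if (!startsWithLetterOrDigit(name)) {
            throw new IllegalCharsetNameException(name);
        }
        return KNOWN.contains(name.toLowerCase(Locale.ROOT));
    }

    // Lenient reading: only the empty name is illegal here; any other
    // odd name is simply an unknown charset, so the lookup returns false.
    static boolean isSupportedLenient(String name) {
        if (name.length() == 0) {
            throw new IllegalCharsetNameException(name);
        }
        return KNOWN.contains(name.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isSupportedLenient("-UTF-8")); // false
        try {
            isSupportedStrict("-UTF-8");
        } catch (IllegalCharsetNameException e) {
            System.out.println("strict reading throws for " + e.getCharsetName());
        }
    }
}
```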

Richard Liang wrote:

> Hello Tim,
>
> I'm wondering why I did not just copy the first sentence. :-)
>
> "A charset name **must** begin with either a letter or a digit."  Does 
> this mean that a charset name which begins with neither a letter nor a 
> digit should be regarded as an illegal charset name?
>
>
> Richard Liang
> China Software Development Lab, IBM
>
>
>
> Tim Ellison wrote:
>
>> Richard Liang wrote:
>>  
>>
>>> Hello Tim,
>>>
>>> I think this is caused by different understanding of the java spec:
>>>
>>> A charset name **must** begin with either a letter or a digit. The
>>> empty string is not a legal charset name....
>>>
>>> What do you think is the implication of "must" here? :-)
>>>     
>>
>>
>> But the name isn't empty, it is "-UTF-8" ?  I must be missing 
>> something...
>>
>> Regards,
>> Tim
>>
>>
>>  
>>
>>> Tim Ellison (JIRA) wrote:
>>>    
>>>
>>>>     [ http://issues.apache.org/jira/browse/HARMONY-68?page=comments#action_12366784 ]
>>>> Tim Ellison commented on HARMONY-68:
>>>> ------------------------------------
>>>>
>>>> The test looks invalid to me.  You should only expect a
>>>> java.nio.charset.IllegalCharsetNameException if the name itself
>>>> contains disallowed characters, and both underscore and dash are
>>>> permitted.
>>>>
>>>> The code     Charset.isSupported("-UTF-8")
>>>>
>>>> should return false, not throw an exception.
>>>>
>>>>  
>>>>      
>>>>
>>>>> java.nio.charset.Charset.isSupported(String charsetName) does not
>>>>> throw IllegalCharsetNameException for spoiled standard sharset name
>>>>> -------------------------------------------------------------------
>>>>>
>>>>>
>>>>>
>>>>>          Key: HARMONY-68
>>>>>          URL: http://issues.apache.org/jira/browse/HARMONY-68
>>>>>      Project: Harmony
>>>>>         Type: Bug
>>>>>   Components: Classlib
>>>>>     Reporter: Svetlana Samoilenko
>>>>>  Attachments: charset_patch.txt
>>>>>
>>>>> According to the j2se 1.4.2 specification for Charset.isSupported(String
>>>>> charsetName), the method must throw IllegalCharsetNameException "if
>>>>> the given charset name is illegal". A legal charset name "must begin
>>>>> with either a letter or a digit". The test listed below shows that
>>>>> no exception is thrown if a "-" or "_" symbol is inserted before a
>>>>> standard charset name, for example "-UTF-8" or "_US-ASCII".
>>>>> Moreover, the method returns "true" in this case.
>>>>> BEA also does not throw the exception, but returns "false".
>>>>> Code to reproduce:
>>>>>
>>>>> import java.nio.charset.*;
>>>>>
>>>>> public class test2 {
>>>>>     public static void main (String[] args) {
>>>>>         // string starts neither a letter nor a digit
>>>>>         boolean sup=false;
>>>>>         try{
>>>>>              sup=Charset.isSupported("-UTF-8");
>>>>>              System.out.println("***BAD. should be exception; sup="+sup);
>>>>>              sup=Charset.isSupported("_US-ASCII");
>>>>>              System.out.println("***BAD. should be exception; sup="+sup);
>>>>>         } catch (IllegalCharsetNameException e) {
>>>>>             System.out.println("***OK. Expected IllegalCharsetNameException " + e);
>>>>>         }
>>>>>     }
>>>>> }
>>>>>
>>>>> Steps to Reproduce:
>>>>> 1. Build Harmony (check-out on 2006-01-30) j2se subset as
>>>>>    described in README.txt.
>>>>> 2. Compile test2.java using BEA 1.4 javac
>>>>>    > javac -d . test2.java
>>>>> 3. Run java using compatible VM (J9)
>>>>>    > java -showversion test2
>>>>>
>>>>> Output:
>>>>> C:\tmp>C:\jrockit-j2sdk1.4.2_04\bin\java.exe -showversion test2
>>>>> java version "1.4.2_04"
>>>>> Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_04-b05)
>>>>> BEA WebLogic JRockit(TM) 1.4.2_04 JVM (build
>>>>> ari-31788-20040616-1132-win-ia32, Native Threads, GC strategy: parallel)
>>>>> ***BAD. should be exception; sup=false
>>>>> ***BAD. should be exception; sup=false
>>>>>
>>>>> C:\tmp>C:\harmony\trunk\deploy\jre\bin\java -showversion test2
>>>>> (c) Copyright 1991, 2005 The Apache Software Foundation or its licensors,
>>>>> as applicable.
>>>>> ***BAD. should be exception; sup=true
>>>>> ***BAD. should be exception; sup=true
>>>>>
>>>>> Suggested junit test case:
>>>>> ------------------------ CharsetTest.java ------------------------
>>>>> import java.nio.charset.*;
>>>>> import junit.framework.*;
>>>>>
>>>>> public class CharsetTest extends TestCase {
>>>>>     public static void main(String[] args) {
>>>>>         junit.textui.TestRunner.run(CharsetTest.class);
>>>>>     }
>>>>>
>>>>>     public void test_isSupported() {
>>>>>         boolean sup=false;
>>>>>         // string starts neither a letter nor a digit
>>>>>         try{
>>>>>             sup=Charset.isSupported("-UTF-8");
>>>>>             fail("***BAD. should be exception IllegalCharsetNameException");
>>>>>         } catch (IllegalCharsetNameException e) {  //expected
>>>>>         }
>>>>>         // string starts neither a letter nor a digit
>>>>>         try{
>>>>>             sup=Charset.isSupported("_US-ASCII");
>>>>>             fail("***BAD. should be exception IllegalCharsetNameException");
>>>>>         } catch (IllegalCharsetNameException e) {  //expected
>>>>>         }
>>>>>     }
>>>>> }
>>>>>             
>>>>
>>>>         
>>>
>>
>>   
>
>

-- 
Karan Singh

