harmony-dev mailing list archives

From "Alexey Petrenko" <alexey.a.petre...@gmail.com>
Subject Re: [classlib][luni][charset]Java canonical charset name(Re: [jira] Commented: (HARMONY-4196) [classlib][luni] InputStreamReader can't handle UnicodeBig encoding)
Date Wed, 15 Aug 2007 10:38:43 GMT
Sounds reasonable.

SY, Alexey

2007/8/15, Yang Paulex <paulex.yang@gmail.com>:
> Oops...I was going to send this to dev list...
>
> ---------- Forwarded message ----------
> From: Yang Paulex <paulex.yang@gmail.com>
> Date: 2007-8-15 3:54 PM
> Subject: [classlib][luni][charset]Java canonical charset name(Re: [jira]
> Commented: (HARMONY-4196) [classlib][luni] InputStreamReader can't handle
> UnicodeBig encoding)
> To: "Alexei Zakharov (JIRA)" <jira@apache.org>
>
> This is yet another historical-vs-canonical encoding name issue in the Java
> platform: java.io/java.lang use older, non-standard canonical names for the
> Unicode encodings, which differ from the java.nio names. The mappings are
> listed in [1] for Java SE 5 and in [2] for Java SE 6.
>
> According to the tables [1][2], the difference between "UnicodeBig" and
> "UnicodeBigUnmarked" (i.e., UTF-16BE) is that UnicodeBig has a BOM (0xFEFF
> for big endian). The same difference applies to UnicodeLittle and
> UnicodeLittleUnmarked.
>
> My suggestion is to just map "UnicodeBig" and "UnicodeLittle" to "utf-16"
> in the InputStreamReader and OutputStreamWriter constructors, because
> utf-16 can recognize the BOM and adapt to the byte stream accordingly. We
> may also need to map the other java.io canonical names to java.nio names
> (currently there is only a reverse map for this). I haven't tested whether
> that is necessary, though.
>
> [1]http://java.sun.com/j2se/1.5.0/docs/guide/intl/encoding.doc.html
> [2]http://java.sun.com/javase/6/docs/technotes/guides/intl/encoding.doc.html
>
> 2007/7/25, Alexei Zakharov (JIRA) <jira@apache.org>:
> >
> >
> > [ https://issues.apache.org/jira/browse/HARMONY-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515050 ]
> >
> > Alexei Zakharov commented on HARMONY-4196:
> > ------------------------------------------
> >
> > I've committed Harmony-4196-InputStreamReader_diagnostics.patch at
> > revision 559141. Hope this helps.
> >
> > > [classlib][luni] InputStreamReader can't handle UnicodeBig encoding
> > > -------------------------------------------------------------------
> > >
> > >                 Key: HARMONY-4196
> > >                 URL: https://issues.apache.org/jira/browse/HARMONY-4196
> > >             Project: Harmony
> > >          Issue Type: Bug
> > >          Components: Classlib
> > >            Reporter: Vasily Zakharov
> > >            Assignee: Alexei Zakharov
> > >            Priority: Minor
> > >         Attachments: Harmony-4196-InputStreamReader_diagnostics.patch
> > >
> > >
> > > Consider the following simple test:
> > > import java.io.*;
> > > public class Test {
> > >     public static void main(String[] args) {
> > >         try {
> > >             new InputStreamReader(new ByteArrayInputStream(new byte[] {(byte) 0xFE, (byte) 0xFF}), "UnicodeBig");
> > >             System.out.println("SUCCESS");
> > >         } catch (Throwable e) {
> > >             System.out.println("FAIL:");
> > >             e.printStackTrace(System.out);
> > >         }
> > >     }
> > > }
> > > Output on RI:
> > > SUCCESS
> > > Output on Harmony (both DRL VM and IBM VM):
> > > FAIL:
> > > java.io.UnsupportedEncodingException
> > >         at java.io.InputStreamReader.<init>(InputStreamReader.java:104)
> > >         at Test.main(Test.java:6)
> > > Additional investigation shows that the cause for this exception is:
> > > java.nio.charset.UnsupportedCharsetException: The unsupported charset name is "UnicodeBig".
> > >         at java.nio.charset.Charset.forName(Charset.java:564)
> > >         at java.io.InputStreamReader.<init>(InputStreamReader.java:99)
> > >         at Test.main(Test.java:5)
> > > Interestingly, a direct call to Charset.forName("UnicodeBig") causes the same exception on the RI as well.
> > > So it seems the problem is not in Charset but in InputStreamReader itself.
> >
> > --
> > This message is automatically generated by JIRA.
> > -
> > You can reply to this email to add a comment to the issue online.
> >
> >
>
>
> --
> Paulex Yang
> China Software Development laboratory
> IBM
>
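
The mapping Paulex suggests in the quoted message could look roughly like the sketch below. It is only an illustration, not Harmony's actual code: the class name LegacyCharsetNames, the IO_TO_NIO table and the resolve/decode helpers are all hypothetical, and the Unmarked entries are inferred from the "UnicodeBigUnmarked (i.e., UTF-16BE)" remark. It resolves the historical java.io names before calling Charset.forName and relies on the UTF-16 decoder reading the BOM to pick the byte order.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class LegacyCharsetNames {
    // Hypothetical alias table: historical java.io names -> java.nio names.
    private static final Map<String, String> IO_TO_NIO = new HashMap<>();
    static {
        IO_TO_NIO.put("unicodebig", "UTF-16");              // BOM decides byte order
        IO_TO_NIO.put("unicodelittle", "UTF-16");           // BOM decides byte order
        IO_TO_NIO.put("unicodebigunmarked", "UTF-16BE");    // assumed from the thread
        IO_TO_NIO.put("unicodelittleunmarked", "UTF-16LE"); // assumed by analogy
    }

    // Translate a legacy name if we know it, then let java.nio look it up.
    public static Charset resolve(String name) {
        String nio = IO_TO_NIO.get(name.toLowerCase(Locale.ROOT));
        return Charset.forName(nio != null ? nio : name);
    }

    // Decode a byte array through InputStreamReader using the resolved charset.
    public static String decode(byte[] bytes, String legacyName) {
        try (Reader r = new InputStreamReader(
                new ByteArrayInputStream(bytes), resolve(legacyName))) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // 'A' (U+0041) preceded by a big-endian BOM, then by a little-endian BOM.
        byte[] big = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x41};
        byte[] little = {(byte) 0xFF, (byte) 0xFE, 0x41, 0x00};
        System.out.println(decode(big, "UnicodeBig"));
        System.out.println(decode(little, "UnicodeLittle"));
    }
}
```

This covers the reading side only. Java's UTF-16 encoder writes big-endian output with a BOM, so mapping "UnicodeLittle" to "utf-16" in OutputStreamWriter would probably need separate handling.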