harmony-dev mailing list archives

From Alexey Varlamov <alexey.v.varla...@gmail.com>
Subject Re: Shall we change our file.encoding
Date Thu, 16 Jul 2009 07:27:48 GMT
The main point of HARMONY-3736 was: why should any VM care about
classlib-specific properties? Let classlib do it, not DRLVM.

Regards,
Alexey

2009/7/16, Charles Lee <littlee1032@gmail.com>:
> Hi guys,
>
> I have added the locale function to DRLVM; the patch is attached. Please
> try this new patch on Linux.
>
> The patch should work on Linux but fail on Windows, because on Windows
> setlocale returns a code page rather than a charset. I have spent a long
> time trying to get the charset name from the code page, for example:
> CPINFOEX cpInfoEx;
> /* Query extended info for the system default ANSI code page. */
> BOOL iReturn = GetCPInfoEx(CP_ACP, 0, &cpInfoEx);
> if (iReturn) {
>     printf("FULL NAME %s\n", cpInfoEx.CodePageName);
> }
> But I only get the full descriptive name, not a standard charset name.
>
> There is a code page identifier map on MSDN, details here. I could hard-code
> this map in the file. But the note on MSDN says:
>      "ANSI code pages can be different on different computers, or can be
> changed for a single computer, leading to data corruption. For the most
> consistent results, applications should use Unicode, such as UTF-8 or
> UTF-16, instead of a specific code page."
> I am afraid a hard-coded map will fail on some machines. (By the way, this
> seems to suggest UTF-8 as the default again :-)
>
> There is also an Encoding class in VC++, details here. But we cannot use
> it here.
>
> So does anyone know something about locales on Windows?
> Again, shall we use UTF-8 as our default?
>
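A minimal sketch of the hard-coded mapping approach discussed above, written in plain C so that it compiles on any platform. It assumes the code page number has already been obtained (on Windows, for instance, from GetACP()). The function name charset_for_code_page, the table contents, and the "Cp<number>" fallback are illustrative assumptions and are not part of the attached patch; the table covers only a few identifiers and would need to be extended from the MSDN list.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical lookup table: Windows code page identifier -> charset name. */
struct cp_entry {
    unsigned int code_page;
    const char *charset;
};

static const struct cp_entry CP_TABLE[] = {
    { 65001u, "UTF-8" },
    { 20127u, "US-ASCII" },
    { 28591u, "ISO-8859-1" }
};

/*
 * Write a charset name for code_page into out.
 * Returns 1 if the name fit into the buffer, 0 if it was truncated.
 */
int charset_for_code_page(unsigned int code_page, char *out, size_t out_size)
{
    size_t i;
    int n;

    /* Exact matches from the table come first. */
    for (i = 0; i < sizeof CP_TABLE / sizeof CP_TABLE[0]; i++) {
        if (CP_TABLE[i].code_page == code_page) {
            n = snprintf(out, out_size, "%s", CP_TABLE[i].charset);
            return n >= 0 && (size_t)n < out_size;
        }
    }

    if (code_page >= 1250u && code_page <= 1258u) {
        /* The Windows "ANSI" code pages 1250..1258 map to windows-125x. */
        n = snprintf(out, out_size, "windows-%u", code_page);
    } else {
        /* Assumed fallback: a generic "Cp<number>" name for unknown pages. */
        n = snprintf(out, out_size, "Cp%u", code_page);
    }
    return n >= 0 && (size_t)n < out_size;
}
```

Because the active code page can differ between machines, a table like this only names the encoding; it does not remove the portability concern raised in the MSDN note.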
>
> On Wed, Jul 15, 2009 at 2:12 PM, Charles Lee <littlee1032@gmail.com> wrote:
> > It seems we should add it to DRLVM.
> >
> >
> >
> >
> >
> > On Wed, Jul 15, 2009 at 1:58 PM, Regis <xu.regis@gmail.com> wrote:
> >
> > >
> > > Nathan Beyer wrote:
> > >
> > > > > Is the IBM VME dealing with this correctly? Do we just need to fix
> > > > > DRLVM?
> > > >
> > >
> > > > Yes. I only tested on Linux; the IBM VME sets the property correctly.
> > >
> > >
> > >
> > >
> > >
> > > >
> > > > > On Wed, Jul 15, 2009 at 12:25 AM, Regis <xu.regis@gmail.com> wrote:
> > > >
> > > > > Kevin Zhou wrote:
> > > > >
> > > > > > > Yes, judging from luniglob.c, CL attempts to read the
> > > > > > > "file.encoding" property from the VM but fails to get the
> > > > > > > correct encoding.
> > > > > > >
> > > > > > > Regis, do you know of any other way that CL can obtain the right
> > > > > > > property?
> > > > > >
> > > > > > We can get it from the OS directly. Maybe just read the environment
> > > > > > variables on Linux?
> > > > >
> > > > >
> > > > > > > On Wed, Jul 15, 2009 at 9:59 AM, Regis <xu.regis@gmail.com> wrote:
> > > > > >
> > > > > >
> > > > > > > Charles Lee wrote:
> > > > > > >
> > > > > > >
> > > > > > > > > Hi Nathan,
> > > > > > > > >
> > > > > > > > > If the file encoding derives from the OS, there must be a bug
> > > > > > > > > in it, because on my Linux machine the locale is en_US.UTF-8
> > > > > > > > > but our default codec is still ISO8859-1. Do you know where
> > > > > > > > > we can find that code?
> > > > > > > >
> > > > > > > >
> > > > > > > Classlib expected vm do this and set the property, but
it
> didn't, so we
> > > > > > > have to do this by ourselves.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > > On Tue, Jul 14, 2009 at 10:17 PM, Nathan Beyer <nbeyer@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Are we talking about Windows or Linux? The default file
> > > > > > > > > > encoding should derive from the OS. I believe that's defined
> > > > > > > > > > by the specs.
> > > > > > > > >
> > > > > > > > > Sent from my iPhone
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > On Jul 14, 2009, at 5:51 AM, Charles Lee <littlee1032@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > On Tue, Jul 14, 2009 at 6:12 PM, Jimmy,Jing Lv <firepure@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > > Hi,
> > > > > > > > > > > >
> > > > > > > > > > > > Charles, I believe UTF-8 is the default encoding for the RI,
> > > > > > > > > > > > and it sounds reasonable.
> > > > > > > > > > > > BTW, it may cause some compatibility problems; maybe we need
> > > > > > > > > > > > to run more tests to verify?
> > > > > > > > > > >
> > > > > > > > > > > > 2009/7/14 Charles Lee <littlee1032@gmail.com>
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi guys:
> > > > > > > > > > > > >
> > > > > > > > > > > > > I am working on some of the Ant JUnit test cases and
> > > > > > > > > > > > > hitting some encoding problems. I think they may be
> > > > > > > > > > > > > caused by the different default encodings of the RI and
> > > > > > > > > > > > > Harmony. My locale is en_US.UTF-8; the RI default is
> > > > > > > > > > > > > UTF-8 but Harmony's is 8859-1. I then came across
> > > > > > > > > > > > > HARMONY-3736 <https://issues.apache.org/jira/browse/HARMONY-3736>
> > > > > > > > > > > > > and the two diffs attached to that issue. It seems we
> > > > > > > > > > > > > always get 8859-1, because (correct me if I am wrong :-)
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1. We removed the code that sets it in the VM, so we
> > > > > > > > > > > > >    always get null when we call the VM method.
> > > > > > > > > > > > > 2. We set file.encoding in libglob.c: if we got null
> > > > > > > > > > > > >    from the VM, we set 8859-1.
> > > > > > > > > > > > >
> > > > > > > > > > > Sorry, it should be luniglob.c
> > > > > > > > > > >
> > > > > > > > > > > > > 3. We cannot set file.encoding at run time.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Ant uses UTF-8 to encode file names that contain
> > > > > > > > > > > > > non-ASCII characters. So why do we use ISO8859-1 as our
> > > > > > > > > > > > > unchangeable default?
> > > > > > > > > > > > > The wiki page http://en.wikipedia.org/wiki/ISO8859-1
> > > > > > > > > > > > > says "In computing applications, encodings that provide
> > > > > > > > > > > > > full UCS support (such as UTF-8
> > > > > > > > > > > > > <http://en.wikipedia.org/wiki/UTF-8> and UTF-16
> > > > > > > > > > > > > <http://en.wikipedia.org/wiki/UTF-16>) are finding
> > > > > > > > > > > > > increasing favor over encodings based on ISO 8859-1."
> > > > > > > > > > > > > Should we simply change ISO8859-1 to UTF-8?
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Yours sincerely,
> > > > > > > > > > > > Charles Lee
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > >
> > > > > > > > > > > Best Regards!
> > > > > > > > > > >
> > > > > > > > > > > Jimmy, Jing Lv
> > > > > > > > > > > China Software Development Lab, IBM
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Yours sincerely,
> > > > > > > > > > Charles Lee
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > --
> > > > > > > Best Regards,
> > > > > > > Regis.
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > > --
> > > > > Best Regards,
> > > > > Regis.
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > > --
> > > Best Regards,
> > > Regis.
> > >
> >
> >
> >
> > --
> > Yours sincerely,
> > Charles Lee
> >
> >
>
>
>
> --
> Yours sincerely,
> Charles Lee
>
>
>
