harmony-dev mailing list archives

From Nathan Beyer <ndbe...@apache.org>
Subject Re: Shall we change our file.encoding
Date Mon, 20 Jul 2009 23:44:44 GMT
I don't think the Windows logic will be quite that simple - I think
we'll have to recreate the mapping defined by the Windows API [1]. In
the case of 936, we'd convert to gb2312, per [1].

The default value is going to vary on each platform. On Windows, if
we can't determine the locale information, we'll default to a locale
of "en" and an encoding of "Windows-1252".
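
To make the idea concrete, here is a rough sketch of the kind of lookup I have in mind. It is not a patch: the names cp_mapping, CP_TABLE and charset_for_code_page are made up for illustration, the table holds only a few rows copied from [1], and the function takes the code page as a parameter so it compiles anywhere. On Windows the argument would be the result of GetACP().

```c
#include <stddef.h>

/* Hypothetical table entry: Windows code page identifier -> charset name. */
typedef struct {
    unsigned int code_page;
    const char *charset;
} cp_mapping;

/* A few rows taken from the Code Page Identifiers table in [1].
 * A real table would need to cover every code page we care about. */
static const cp_mapping CP_TABLE[] = {
    { 932,   "shift_jis" },
    { 936,   "gb2312" },
    { 950,   "big5" },
    { 1251,  "windows-1251" },
    { 1252,  "windows-1252" },
    { 65001, "utf-8" },
};

/* Returns the charset name for a code page identifier, or the
 * "Windows-1252" default when the identifier is not in the table. */
static const char *charset_for_code_page(unsigned int code_page)
{
    size_t i;
    for (i = 0; i < sizeof(CP_TABLE) / sizeof(CP_TABLE[0]); i++) {
        if (CP_TABLE[i].code_page == code_page) {
            return CP_TABLE[i].charset;
        }
    }
    return "Windows-1252";
}
```

With something like this, the Windows side would reduce to charset_for_code_page(GetACP()), and an unknown identifier falls through to the default described above.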

-Nathan

[1] http://msdn.microsoft.com/en-us/library/dd317756%28VS.85%29.aspx

On Mon, Jul 20, 2009 at 5:05 AM, Charles Lee<littlee1032@gmail.com> wrote:
> Hi guys,
>
> A new patch is attached, but it still fails on Windows. It *seems* the VM
> does not support CP936.
>
> 1. I tried hard-coding "CP936" in luniglob.c so that file.encoding is
>    always CP936. The VM failed to launch with the message
>    "HMYEXEL054E vm inner fault: can not create java/lang/String, FAILED to
>    invoke JVM" (the original message is in Chinese; this is my translation).
> 2. I tried hard-coding "UTF-8" in luniglob.c so that file.encoding is
>    always UTF-8. The VM launched successfully and the tests passed.
>
> Does anybody know where the VM loads the String? And what does
> "HMYEXEL054E" mean?
>
> On Sat, Jul 18, 2009 at 11:10 AM, Nathan Beyer <ndbeyer@apache.org> wrote:
>>
>> On Fri, Jul 17, 2009 at 6:03 AM, Alexey
>> Varlamov<alexey.v.varlamov@gmail.com> wrote:
>> > 2009/7/17, Nathan Beyer <ndbeyer@apache.org>:
>> >> On Thu, Jul 16, 2009 at 8:50 PM, Nathan Beyer<ndbeyer@apache.org>
>> >> wrote:
>> >> > On Thu, Jul 16, 2009 at 8:35 PM, Nathan Beyer<ndbeyer@apache.org>
>> >> > wrote:
>> >> >> On Thu, Jul 16, 2009 at 8:26 PM, Nathan Beyer<ndbeyer@apache.org>
>> >> >> wrote:
>> >> >>> On Thu, Jul 16, 2009 at 8:18 PM, Charles Lee<littlee1032@gmail.com>
>> >> >>> wrote:
>> >> >>>> Hi Nathan,
>> >> >>>>
>> >> >>>> What I got is 936, the code page identifier. Is there an API for
>> >> >>>> us to map 936 to gb2312?
>> >> >>>
>> >> >>> Oh, the 'identifier' bit was missing - yeah, we'll need to
>> >> >>> translate that into a name of some sort. I'll poke around a bit
>> >> >>> and see what I can find.
>> >> >>
>> >> >> We'll probably just have to put in a mapping ourselves based on
>> >> >> the documentation. We'd call GetACP [1] and map that to a known
>> >> >> alias in java.nio.charset that matches the definitions [2] of the
>> >> >> identifiers.
>> >> >>
>> >> >> [1] http://msdn.microsoft.com/en-us/library/dd318070%28VS.85%29.aspx
>> >> >> [2] http://msdn.microsoft.com/en-us/library/dd317756%28VS.85%29.aspx
>> >> >
>> >> > This may be better - APR has a function for getting the OS default
>> >> > encoding. This would work across all platforms that APR supports,
>> >> > and I believe we already use APR.
>> >> >
>> >> >
>> >> > http://apr.apache.org/docs/apr/1.3/group__apr__portabile.html#g6e21845a4a5f3b7dd107b2beea50c91e
>> >>
>> >> However, the Windows version of this is simply:
>> >>   return apr_psprintf(pool, "CP%u", (unsigned) GetACP());
>> >> which is essentially "CP" + codePageId.
>> >>
>> >> And the Unix version of this method doesn't look very good for our
>> >> purposes.
>> >> >
>> >> > -Nathan
>> >
>> > Yep - that's why APR was not used here initially. I guess your idea of
>> > GetACP() + hardcoded mapping is the most suitable approach. We already
>> > have a similar solution for timezone detection, see
>> > working_vm\vm\port\src\misc\win\timezone.c (which also should be moved
>> > to classlib eventually, HARMONY-2053).
>>
>> I'd be inclined to combine these all together into the portlib
>> (luni?). Perhaps in some sort of OS environment portion, which can be
>> used by the rest of the class library.
>>
>> -Nathan
>>
>> >
>> > --
>> > Alexey
>> >
>
>
>
> --
> Yours sincerely,
> Charles Lee
>
>
