harmony-dev mailing list archives

From Regis <xu.re...@gmail.com>
Subject Re: [jira] Commented: (HARMONY-6408) [classlib][luni]OutputStreamWriterTest got java.nio.BufferOverflowException
Date Tue, 22 Dec 2009 03:12:35 GMT
On 2009-12-21 22:07, Ray Chen wrote:
> Hi Regis,
> RI shows GBK, but harmony shows GB2312
>
> I have found that "(*vmInterface)->GetSystemProperty (vmInterface,
> "file.encoding",&propVal);"
> It get GBK when uses IBMvm, but got NULL when uses DRLVM
>
> So when used DRLVM it comes to getOSCharset() and getOSCharset()
> invoke GetLocaleInfo() in turn.
>
> You can find the details in windows/helpers.c, the cp is 936 on my
> machine and got gb2312 in charsetmap.h.
>
> So the question is change the vm GetSystemProperty, or our encoding mapping?

According to [1], CP936 is equivalent to GBK.

[1] http://msdn.microsoft.com/en-us/goglobal/bb964654.aspx

However, in the mapping table [2]

[2] http://msdn.microsoft.com/en-us/library/dd317756%28VS.85%29.aspx

CP936 is mapped to GB2312, whereas in Windows Control Panel -> Regional and
Language Options -> Advanced, CP936 is shown as GBK.

After some searching, many sources say that CP936 includes more characters
than GB2312, so I think GBK is the more reasonable choice.
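
As a quick sanity check of the "CP936 has more characters than GB2312" claim,
here is a minimal Python sketch. It is not Harmony code and says nothing about
what windows/helpers.c or charsetmap.h do; it only relies on Python's own
standard codecs ("cp936", "gbk", "gb2312"). The helper name can_encode and the
two sample characters are my own choices for illustration.

```python
import codecs


def can_encode(text, charset):
    """Return True if every character in text is encodable in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# In Python's codec registry, "cp936" is an alias of the GBK codec.
cp936_codec_name = codecs.lookup("cp936").name

# U+4E2D is a common character that both charsets contain.
common_char = "\u4e2d"
# U+9555 is a character that GBK contains but GB2312 lacks
# (sample picked for illustration).
gbk_only_char = "\u9555"

for ch in (common_char, gbk_only_char):
    print(
        "U+%04X" % ord(ch),
        "gb2312:", can_encode(ch, "gb2312"),
        "gbk:", can_encode(ch, "gbk"),
    )
print("cp936 resolves to:", cp936_codec_name)
```

If the default encoding is reported as GB2312 on a CP936 machine, a writer
would reject characters like the second one even though the OS code page can
represent them, which is another reason to prefer GBK in the mapping.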

>
> On Mon, Dec 21, 2009 at 8:48 PM, Regis<xu.regis@gmail.com>  wrote:
>> On 2009-12-21 19:40, Ray Chen wrote:
>>>
>>> Hi Regis,
>>>
>>> I rebuild the luni native code in trunk, and found that its behavior
>>> same as java6 now, I don't know what happened.
>>> Sorry for confusing.
>>
>> That happens often ;)
>>
>>
>> [1] is a discussion before
>>
>> [1] http://markmail.org/thread/ef4lwojc23vb6vnv
>>
>>>
>>> Now the problem is simple, I think we got GB2312 as the default encode
>>> on my machine, I will do more investigation.
>>
>> What's default encode of your OS?
>>
>>>
>>> On Mon, Dec 21, 2009 at 5:19 PM, Regis<xu.regis@gmail.com>    wrote:
>>>>
>>>> On 2009-12-21 15:54, Ray Chen (JIRA) wrote:
>>>>>
>>>>>      [ https://issues.apache.org/jira/browse/HARMONY-6408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793106#action_12793106 ]
>>>>>
>>>>> Ray Chen commented on HARMONY-6408:
>>>>> -----------------------------------
>>>>>
>>>>> Hi,
>>>>> I have investigated this issue, found that if uses IBM vm, the default
>>>>> encoding on my machine is GB18030 while using DRLVM it is GB2312.
>>>>> I searched GB18030, found it on http://en.wikipedia.org/wiki/GB_18030
>>>>> which says GB2312 should be replaced with GB18030.
>>>>>
>>>>> The question is why different vm got different default file encoding?
>>>>> It seems that System.ensureProperties() got the default file encoding,
>>>>> in this function calls a static native method named "getEncoding()".
>>>>> But I can not find this native funtion in my classlib working copy.
>>>>> Does anyone know about this? Is this a classlib bug or vm bug?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> Do you mean the value of property "file.encoding"? It's set at
>>>> modules/luni/src/main/native/luni/shared/luniglob.c:159
>>>>
>>>> We first check whether the value is NULL, if so, call getOSCharset to get
>>>> default value from OS ( you can reference HARMONY-6279 for more details). I
>>>> guess IBM vm set the value to GB18030, but drlvm doesn't set it, and then we
>>>> use getOSCharset, get a different charset. The charset should be same with
>>>> your local setting. According to you previous comments on JIRA, seems GB2312
>>>> is correct.
>>>>
>>>> And I think we should fix the test not to depends on local environment.
>>>>
>>>> --
>>>> Best Regards,
>>>> Regis.
>>>>
>>>
>>>
>>>
>>
>>
>> --
>> Best Regards,
>> Regis.
>>
>
>
>


-- 
Best Regards,
Regis.
