hadoop-common-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9801) Configuration#writeXml uses platform defaulting encoding, which may mishandle multi-byte characters.
Date Tue, 30 Jul 2013 17:39:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724141#comment-13724141 ]

Chris Nauroth commented on HADOOP-9801:
---------------------------------------

Thanks to [~daijy] for finding and reporting this bug via Hive testing on Windows, where the
default encoding is CP-1252.
                
> Configuration#writeXml uses platform defaulting encoding, which may mishandle multi-byte characters.
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-9801
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9801
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf
>    Affects Versions: 3.0.0, 1-win, 2.1.0-beta, 1.3.0
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>
> The overload of {{Configuration#writeXml}} that accepts an {{OutputStream}} does not
> set encoding explicitly, so it chooses the platform default encoding.  Depending on the platform's
> default encoding, this can cause incorrect output data when encoding multi-byte characters.
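
The failure mode described in the issue can be illustrated with a minimal Java sketch. This is not the Hadoop code or the eventual patch. The class name EncodingSketch, the helper encode, and the sample string are hypothetical, and windows-1252 is passed explicitly here only to stand in for a Windows platform default. The sketch writes the same text through an OutputStreamWriter twice, once with UTF-8 and once with windows-1252, and compares the resulting bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSketch {

    // Hypothetical helper (not part of Hadoop): encodes text by writing it
    // through an OutputStreamWriter that uses the given charset.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Three characters that windows-1252 cannot represent.
        String value = "\u65e5\u672c\u8a9e";

        // Explicit UTF-8: each character becomes three bytes.
        byte[] utf8 = encode(value, StandardCharsets.UTF_8);

        // Assumption: windows-1252 stands in for the platform default
        // on Windows. Unmappable characters are replaced with '?'.
        Charset cp1252 = Charset.forName("windows-1252");
        byte[] lossy = encode(value, cp1252);

        System.out.println(utf8.length);   // 9
        System.out.println(lossy.length);  // 3

        // Only the UTF-8 bytes decode back to the original text.
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(value)); // true
        System.out.println(new String(lossy, cp1252));                              // ???
    }
}
```

Passing the charset explicitly, as in the first call, makes the output independent of the platform the JVM runs on.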

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
