hadoop-common-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-9801) Configuration#writeXml uses platform defaulting encoding, which may mishandle multi-byte characters.
Date Wed, 31 Jul 2013 22:41:49 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-9801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated HADOOP-9801:
----------------------------------

    Attachment: HADOOP-9801-trunk.1.patch

I'm attaching a patch that sets the encoding to UTF-8 explicitly.  I've also added a unit test
that round-trips multi-byte characters in property names and values through save and load.
Without the config code change, the new test fails on any platform whose default encoding
can't represent those characters (like Windows with CP-1252).
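
To illustrate the failure mode outside Hadoop, here is a minimal sketch using only the JDK. It is not the code from the patch: the class and method names are hypothetical, and ISO-8859-1 is an assumed stand-in for a single-byte platform default such as CP-1252. It writes a string through an OutputStreamWriter with a given charset and reads it back with the same charset.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Hadoop or the attached patch.
public class EncodingRoundTrip {

    // Encode text to bytes through an OutputStreamWriter using the given charset.
    static byte[] write(String text, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        }
        return out.toByteArray();
    }

    // Write with one charset, then decode the resulting bytes with the same charset.
    static String roundTrip(String text, Charset charset) throws IOException {
        return new String(write(text, charset), charset);
    }

    public static void main(String[] args) throws IOException {
        // Two CJK characters that ISO-8859-1 and CP-1252 cannot represent.
        String value = "\u540d\u524d";

        // Assumption: ISO-8859-1 stands in for a single-byte platform default.
        // The writer substitutes '?' for each unmappable character, so data is lost.
        System.out.println(value.equals(roundTrip(value, StandardCharsets.ISO_8859_1)));

        // With UTF-8 set explicitly, the same text survives the round trip.
        System.out.println(value.equals(roundTrip(value, StandardCharsets.UTF_8)));
    }
}
```

The single-byte case loses the original characters at write time, so no choice of decoder on the read side can recover them. That is why the encoding has to be pinned where the output is written.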
                
> Configuration#writeXml uses platform defaulting encoding, which may mishandle multi-byte
> characters.
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-9801
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9801
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: conf
>    Affects Versions: 3.0.0, 1-win, 2.1.0-beta, 1.3.0
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HADOOP-9801-branch-1.1.patch, HADOOP-9801-trunk.1.patch
>
>
> The overload of {{Configuration#writeXml}} that accepts an {{OutputStream}} does not
> set encoding explicitly, so it chooses the platform default encoding.  Depending on the platform's
> default encoding, this can cause incorrect output data when encoding multi-byte characters.

