commons-issues mailing list archives

From "Woo Ju Shin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COMPRESS-212) TarArchiveEntry getName() returns wrongly encoded name even when you set encoding to TarArchiveInputStream
Date Thu, 03 Jan 2013 23:16:12 GMT

    [ https://issues.apache.org/jira/browse/COMPRESS-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543402#comment-13543402 ]

Woo Ju Shin commented on COMPRESS-212:
--------------------------------------

The archive was created on Red Hat Linux using the tar command. The files included in
this archive were created with the cat command, with a filename argument encoded in "UTF-8".
The system default encoding is also set to "UTF-8".
                
> TarArchiveEntry getName() returns wrongly encoded name even when you set encoding to TarArchiveInputStream
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: COMPRESS-212
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-212
>             Project: Commons Compress
>          Issue Type: Bug
>    Affects Versions: 1.4.1
>         Environment: Red Hat Enterprise Linux, MS Windows 7
>            Reporter: Woo Ju Shin
>            Priority: Minor
>
> I have two systems. One runs Red Hat Linux, the other MS Windows.
> I created a *.tgz file on Red Hat Linux and tried to decompress it on MS Windows using Commons Compress.
> The default system encodings differ: UTF-8 on Red Hat Linux and CP949 on MS Windows.
> It seems that the file name encoding follows the platform default encoding even when I use the following to untar it:
> FileInputStream fis = new FileInputStream(new File(*.tgz)); // *.tgz stands for the archive path
> TarArchiveInputStream zis = new TarArchiveInputStream(new BufferedInputStream(fis), encodingOfRedHatLinux);
> TarArchiveEntry entry;
> while ((entry = (TarArchiveEntry) zis.getNextEntry()) != null)
> {
>     entry.getName(); // filename is not UTF-8; it comes back encoded in CP949, so the filename isn't consistent
> }
> According to this constructor and its Javadoc:
>     /**
>      * Constructor for TarInputStream.
>      * @param is the input stream to use
>      * @param encoding name of the encoding to use for file names
>      * @since Commons Compress 1.4
>      */
>     public TarArchiveInputStream(InputStream is, String encoding) {
>         this(is, TarBuffer.DEFAULT_BLKSIZE, TarBuffer.DEFAULT_RCDSIZE, encoding);
>     }
> the given encoding should be used for file names.
> However, this doesn't seem to work.
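
For illustration only, the mismatch described above can be reproduced with the JDK alone, without Commons Compress. The following is a minimal sketch, not the library's implementation: the class name TarNameDecodingSketch, both helper methods, and the sample file name are hypothetical. It assumes the classic 100-byte, NUL-padded tar name field, and it falls back to ISO-8859-1 as a stand-in if the JDK does not ship the CP949 charset (x-windows-949).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: shows why the charset used to decode a tar name field matters.
class TarNameDecodingSketch {
    // Size of the name field in a classic tar header.
    static final int NAME_FIELD_LENGTH = 100;

    // Encode a file name into a fixed-size, NUL-padded name field.
    static byte[] toNameField(String name, Charset charset) {
        byte[] encoded = name.getBytes(charset);
        if (encoded.length > NAME_FIELD_LENGTH) {
            throw new IllegalArgumentException("name does not fit into the name field");
        }
        return Arrays.copyOf(encoded, NAME_FIELD_LENGTH); // pads with zero bytes
    }

    // Decode the bytes up to the first NUL using the given charset.
    static String decodeName(byte[] field, Charset charset) {
        int len = 0;
        while (len < field.length && field[len] != 0) {
            len++;
        }
        return new String(field, 0, len, charset);
    }

    public static void main(String[] args) {
        // Sample Korean file name, written with Unicode escapes (assumption for the demo).
        String original = "\uD55C\uAE00.txt";
        // The archive side (Red Hat Linux in the report) writes the name as UTF-8.
        byte[] field = toNameField(original, StandardCharsets.UTF_8);

        // The reading side (MS Windows in the report) defaults to CP949.
        Charset windowsDefault = Charset.isSupported("x-windows-949")
                ? Charset.forName("x-windows-949")
                : StandardCharsets.ISO_8859_1; // stand-in if CP949 is unavailable

        // Decoding with the charset the archive was written in restores the name.
        System.out.println(original.equals(decodeName(field, StandardCharsets.UTF_8)));
        // Decoding with the platform default yields a different, garbled string.
        System.out.println(original.equals(decodeName(field, windowsDefault)));
    }
}
```

If the encoding passed to TarArchiveInputStream were applied to every name, getName() would be expected to behave like the first decodeName call above, and the reported behaviour resembles the second.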

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
