commons-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Stefan Bodewig (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COMPRESS-212) TarArchiveEntry getName() returns wrongly encoded name even when you set encoding to TarArchiveInputStream
Date Thu, 03 Jan 2013 16:34:13 GMT

    [ https://issues.apache.org/jira/browse/COMPRESS-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543039#comment-13543039
] 

Stefan Bodewig commented on COMPRESS-212:
-----------------------------------------

Just a few quick questions:

How has the original archive been created?  When reading the archive, how do you sepcify the
encoding exactly?  "utf-8"?
                
> TarArchiveEntry getName() returns wrongly encoded name even when you set encoding to
TarArchiveInputStream
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: COMPRESS-212
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-212
>             Project: Commons Compress
>          Issue Type: Bug
>    Affects Versions: 1.4.1
>         Environment: Red Hat Enterprise Linux, MS Windows 7
>            Reporter: Woo Ju Shin
>            Priority: Minor
>
> I have two file systems. One is Red Hat Linux, the other is MS Windows.
> I created a *.tgz file in Red Hat Linux and tried to decompress it in MS Windows using
Commons Compress.
> The default system encoding are different. UTF-8 in Red Hat Linux and CP949 in MS Windows.
> It seems that the file name encoding follows the default encoding even though when I
use the following to untar it.
> FileInputStream fis = new FileInputStream(new File(*.tgz));
> TarArchiveInputStream zis = new TarArchiveInputStream(new BufferedInputStream(fis),encodingOfRedHatLinux);
> while ((entry = (TarArchiveEntry)zis.getNextEntry()) != null)
> {
> entry.getName(); // filename is not UTF-8 it is encoded in CP949 and so the filename
isn't consistent
> }
> By referring to this
>     /**
>      * Constructor for TarInputStream.
>      * @param is the input stream to use
>      * @param encoding name of the encoding to use for file names
>      * @since Commons Compress 1.4
>      */
>     public TarArchiveInputStream(InputStream is, String encoding) {
>         this(is, TarBuffer.DEFAULT_BLKSIZE, TarBuffer.DEFAULT_RCDSIZE, encoding);
>     }
> encoding should be used for file names.
> But actually this doesn't seem to work.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message