commons-issues mailing list archives

From "Stefan Bodewig (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COMPRESS-183) Support for de/encoding of tar entry names other than plain 8BIT conversion.
Date Sat, 17 Mar 2012 06:37:39 GMT

    [ https://issues.apache.org/jira/browse/COMPRESS-183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231866#comment-13231866 ]

Stefan Bodewig commented on COMPRESS-183:
-----------------------------------------

The zip package already contains code similar to the codec in your patch; I'll look
into reusing it.

Modern (POSIX) tars support non-ASCII encodings via PAX extension headers. Current trunk
already supports these headers on the reading side, and adding them on the writing side
shouldn't be too hard.
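
For illustration only, here is a minimal sketch of how a single PAX extended header record could be built on the writing side. It assumes the record layout defined by POSIX pax ("<length> <keyword>=<value>\n", UTF-8 encoded, where the length counts the whole record including its own digits). The class and method names are hypothetical and are not taken from Commons Compress.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Commons Compress.
final class PaxRecordSketch {

    // Builds one record of the form "<length> <key>=<value>\n" in UTF-8.
    // <length> is the byte count of the entire record, digits included.
    static byte[] encode(String key, String value) {
        byte[] body = (" " + key + "=" + value + "\n").getBytes(StandardCharsets.UTF_8);

        // The length field counts itself, so start with one digit and
        // repeat until the total no longer changes.
        int len = body.length + 1;
        while (String.valueOf(len).length() + body.length != len) {
            len = String.valueOf(len).length() + body.length;
        }

        byte[] digits = String.valueOf(len).getBytes(StandardCharsets.US_ASCII);
        byte[] record = new byte[len];
        System.arraycopy(digits, 0, record, 0, digits.length);
        System.arraycopy(body, 0, record, digits.length, body.length);
        return record;
    }

    public static void main(String[] args) {
        // "\u00e4" is a-umlaut, which takes two bytes in UTF-8.
        byte[] record = encode("path", "\u00e4.txt");
        System.out.println(record.length + " bytes");
    }
}
```

A real writer would also have to emit the tar entry that carries these records and pad it to the block size; that part is left out here.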
                
> Support for de/encoding of tar entry names other than plain 8BIT conversion.
> ----------------------------------------------------------------------------
>
>                 Key: COMPRESS-183
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-183
>             Project: Commons Compress
>          Issue Type: Improvement
>          Components: Archivers
>    Affects Versions: 1.3
>            Reporter: Joao Schim
>              Labels: patch
>             Fix For: 1.4
>
>         Attachments: patch-tar-name-encoding.diff, patch-tar-name-encoding.diff, patch-tar-name-encoding.diff
>
>
> The names of tar entries are currently encoded/decoded by means of plain 8-bit conversions
> of byte to char and vice versa. This prohibits the use of encodings like UTF-8 in file
> names. Whether the use of UTF-8 (or any other non-ASCII encoding) in file names is sensible
> is a chapter of its own. However, tar archives containing files whose names have been
> encoded with UTF-8 do float around. These files currently cannot be read correctly by
> commons-compress because the encoding is hardcoded to plain 8BIT only.
> The supplied patch allows the use of encodings other than 8BIT by means of a TarArchiveCodec
> structure. It does not change the standard functionality, but adds the possibility of using
> a different encoding.
> A method was added to the TarUtilsTest junit test to test the added functionality.
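
To make the problem described in the issue concrete, the sketch below contrasts a plain 8-bit byte-to-char conversion with UTF-8 decoding of the same name bytes. It is an assumption about what such a conversion amounts to, not the actual Commons Compress code, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not the Commons Compress implementation.
final class TarNameDecodingSketch {

    // Plain 8-bit conversion: each byte becomes one char with the same value.
    static String decode8Bit(byte[] raw) {
        StringBuilder sb = new StringBuilder(raw.length);
        for (byte b : raw) {
            sb.append((char) (b & 0xFF)); // mask to avoid sign extension
        }
        return sb.toString();
    }

    // Decoding the same bytes as UTF-8 restores multi-byte characters.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Name bytes as a UTF-8 aware tar writer would store them.
        byte[] raw = "\u00e4.txt".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode8Bit(raw).length()); // one char per byte
        System.out.println(decodeUtf8(raw).length()); // one char per character
    }
}
```

With the 8-bit conversion, the two UTF-8 bytes of the a-umlaut turn into two separate characters, so the entry name no longer matches the original file name.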

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
