commons-issues mailing list archives

From "Peter De Maeyer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (COMPRESS-206) TarArchiveOutputStream sometimes writes garbage beyond the end of the archive
Date Sun, 30 Dec 2012 17:06:13 GMT

    [ https://issues.apache.org/jira/browse/COMPRESS-206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541107#comment-13541107 ]

Peter De Maeyer commented on COMPRESS-206:
------------------------------------------

Indeed, TarArchiveOutputStream does not write garbage as such. Initially I thought the second
EOF block was "garbage". When I dug into the code I understood that the second EOF block was
intentional. I then realized that the actual problem was that TarArchiveInputStream did not
read back the second EOF block.

Anyway, the point is that there is a unit test illustrating my use case, and the patch fixes
it. COMPRESS-202 is merely about documentation, but my patch really fixes _behavior_. So I
would argue that my patch does a lot more than just address COMPRESS-202...

Rephrasing the issue: "TarArchiveOutputStream sometimes writes bytes at the end of the archive
which are never consumed by TarArchiveInputStream".
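
To illustrate the symptom, here is a minimal sketch in plain Java. It is not
Commons Compress code and not the attached patch. It assumes a simplified layout:
one non-zero 512-byte "entry" block, the two all-zero end-of-archive blocks that
the TAR format uses, and then a 4-byte trailer standing in for my checksum. The
class and method names (TarTrailerSketch, buildArchive, readTrailer) are made up
for the example, and any extra record padding a real writer may add is ignored.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical, simplified model of the stream layout; not the library's API.
final class TarTrailerSketch {
    static final int BLOCK = 512; // TAR block size in bytes

    /** Builds: one non-zero "entry" block, two zero EOF blocks, then a 4-byte trailer. */
    static byte[] buildArchive(int trailer) {
        ByteBuffer buf = ByteBuffer.allocate(3 * BLOCK + Integer.BYTES);
        byte[] entry = new byte[BLOCK];
        Arrays.fill(entry, (byte) 1);   // stand-in for header and data
        buf.put(entry);
        buf.put(new byte[2 * BLOCK]);   // end-of-archive marker: two zero blocks
        buf.putInt(trailer);            // data written after the TAR content
        return buf.array();
    }

    static boolean isZero(byte[] block) {
        for (byte b : block) {
            if (b != 0) {
                return false;
            }
        }
        return true;
    }

    /**
     * Skips blocks up to and including the first all-zero block, consumes
     * (eofBlocksToConsume - 1) further blocks, then reads the next 4 bytes.
     */
    static int readTrailer(byte[] archive, int eofBlocksToConsume) {
        ByteBuffer buf = ByteBuffer.wrap(archive);
        byte[] block = new byte[BLOCK];
        do {
            buf.get(block);
        } while (!isZero(block));
        for (int i = 1; i < eofBlocksToConsume; i++) {
            buf.get(block);
        }
        return buf.getInt();
    }

    public static void main(String[] args) {
        byte[] archive = buildArchive(0xCAFEBABE);
        System.out.printf("stop after 1 EOF block: 0x%08X%n", readTrailer(archive, 1));
        System.out.printf("consume 2 EOF blocks:   0x%08X%n", readTrailer(archive, 2));
    }
}
```

In this model, a reader that stops after the first EOF block reads zeros where
the trailer was expected, while a reader that consumes both EOF blocks reads the
trailer back intact.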
                
> TarArchiveOutputStream sometimes writes garbage beyond the end of the archive
> -----------------------------------------------------------------------------
>
>                 Key: COMPRESS-206
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-206
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Compressors
>    Affects Versions: 1.0, 1.4.1
>         Environment: Linux x86
>            Reporter: Peter De Maeyer
>             Fix For: 1.5
>
>         Attachments: COMPRESS-206.patch
>
>
> For some combinations of file lengths, TarArchiveOutputStream writes garbage beyond
> the end of the TAR stream. TarArchiveInputStream can still read the stream without
> problems, but it does not read beyond the garbage. This is problematic for my use case
> because I write a checksum _after_ the TAR content. If I then try to read the checksum
> back, I read garbage instead.
> Functional impact:
> * TarArchiveInputStream is asymmetrical with respect to TarArchiveOutputStream, in the
> sense that TarArchiveInputStream does not read everything that was written by
> TarArchiveOutputStream.
> * The content is unnecessarily large: the garbage adds ~10K of overhead compared to
> Linux command-line tar.
> This symptom is remarkably similar to #COMPRESS-81, which is supposedly fixed since 1.1,
> except that this issue still exists. I've tested this with 1.0 and 1.4.1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
