commons-issues mailing list archives

From "Jeremy Gustie (JIRA)" <>
Subject [jira] [Commented] (COMPRESS-336) Extended Standard TAR format prefix is 130 characters
Date Mon, 08 Feb 2016 18:41:39 GMT


Jeremy Gustie commented on COMPRESS-336:

Also, I don't think this would have been an issue if {{TarUtils.parseName}} read from the
front instead of the back. Was that done because most fields are likely to contain more
valid bytes than padding? If that is an important optimization, perhaps a sort of modified
binary search for the first NUL would still be resilient to issues like this (where a field
is split at less than 50/50 in some TAR format) while still avoiding a full linear search
for the terminator.
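
For illustration, here is a minimal sketch of the "read from the front" idea. It is not the Commons Compress implementation: the class and method names are hypothetical, the scan is the plain linear one rather than the modified binary search, and ISO-8859-1 is an assumed encoding. It stops at the first NUL, so anything stored after the terminator (such as the xstar time stamps) never reaches the name.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Commons Compress.
public final class FirstNulScan {
    private FirstNulScan() {}

    /** Returns the number of bytes before the first NUL in buffer[offset, offset + length). */
    public static int lengthToFirstNul(byte[] buffer, int offset, int length) {
        for (int i = 0; i < length; i++) {
            if (buffer[offset + i] == 0) {
                return i; // terminator found, ignore everything after it
            }
        }
        return length; // no terminator: the whole field is the name
    }

    /** Decodes the name up to the first NUL (assumed encoding: ISO-8859-1). */
    public static String parseNameFromFront(byte[] buffer, int offset, int length) {
        int len = lengthToFirstNul(buffer, offset, length);
        return new String(buffer, offset, len, StandardCharsets.ISO_8859_1);
    }
}
```

A plain binary search would assume the field is "all name bytes, then all NULs", which is exactly the assumption the xstar layout breaks, so any faster search would still need a forward check like this one to confirm the result.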

> Extended Standard TAR format prefix is 130 characters
> -----------------------------------------------------
>                 Key: COMPRESS-336
>                 URL:
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Archivers
>    Affects Versions: 1.10
>            Reporter: Jeremy Gustie
> A TAR archive created with star having an artype of "xstar" apparently limits the PREFIX
> to 130 characters to accommodate an access time and a creation time (this much I was able
> to learn from the star man page). I wasn't able to track down any specifics about the format,
> but in at least the first example I found, it appears that the access and creation time are
> stored as two space terminated ASCII numbers at the end of what would otherwise be the prefix.
> Currently, the code will read this type of archive and assume the prefix is 131 NULs
> followed by the two ASCII time stamps. Needless to say, it makes a mess of the entry names.
> I'm not 100% sure of the implementation, but perhaps something like (with {{XSTAR_PREFIXLEN}}
> == 130):
> {code}
> default: {
>   final int prefixlen = header[offset + XSTAR_PREFIXLEN + 1] == 0 ? XSTAR_PREFIXLEN :
>   String prefix = oldStyle
>     ? TarUtils.parseName(header, offset, prefixlen)
>     : TarUtils.parseName(header, offset, prefixlen, encoding);
>   // ...
> }
> {code}
> Maybe a separate feature request would be appropriate for capturing and exposing the
> additional timestamps?
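
As a rough sketch of what capturing those timestamps could look like, the following reads two space-terminated ASCII numbers from the tail of the prefix field. Everything beyond the issue description is an assumption made for illustration: the class and constant names are hypothetical, the numbers are assumed to start 131 bytes into the standard 155-byte prefix field, and they are assumed to be octal like other numeric TAR header fields.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Commons Compress.
public final class XstarTimes {
    // Standard ustar prefix field length.
    static final int PREFIX_FIELD_LEN = 155;
    // Assumption: 130 name characters plus a NUL precede the time stamps.
    static final int XSTAR_PREFIX_AREA = 131;

    private XstarTimes() {}

    /**
     * Parses the two space-terminated ASCII numbers at the end of the prefix field.
     * Returns {accessTime, creationTime}; the octal radix is an assumption.
     */
    public static long[] parseTrailingTimes(byte[] header, int prefixOffset) {
        int start = prefixOffset + XSTAR_PREFIX_AREA;
        int len = PREFIX_FIELD_LEN - XSTAR_PREFIX_AREA;
        String tail = new String(header, start, len, StandardCharsets.US_ASCII);
        // trim() drops leading/trailing spaces and NULs before splitting on spaces
        String[] parts = tail.trim().split(" +");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected two time stamps, found " + parts.length);
        }
        return new long[] { Long.parseLong(parts[0], 8), Long.parseLong(parts[1], 8) };
    }
}
```

A caller would first decide that the header really is in the xstar layout (for example with a check like the one in the snippet above) before trusting these values.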

This message was sent by Atlassian JIRA
