commons-dev mailing list archives

From Bear Giles <>
Subject Re: [compress] not reading archive stream completely
Date Thu, 17 Jan 2013 17:38:37 GMT
I think a number of applications use a concatenation of a standard archive
format and custom data. The best known is probably .rpm, which is/was a
cpio stream immediately followed by additional information (iirc; it might
be the other way around). In any case a developer might expect the input
stream to be left at the end of the archive, not at the end of the input
stream.

On the zip 'central directory': one of the big 'wins' of the zip format is
that it lets you seek directly to a file instead of having to scan the
archive sequentially. For various reasons (e.g., the need to support
streaming modes) the directory has to go at the end of the archive. The unix
backup format puts the directory at the top of the archive, but it was
optimized for backups that spanned multiple tapes, so the cost of
precomputing the values was worth it.
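
To illustrate why a zip reader ends up looking at the tail of the archive,
here is a rough sketch (not the commons-compress code) that finds the "end
of central directory" record by scanning backwards for its signature and
then reads the central directory offset from it. The class and method names
are made up for this example, and it assumes the whole archive is already in
a byte array and ignores Zip64.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper for illustration only; not part of commons-compress.
final class EocdFinder {
    // Signature of the end-of-central-directory (EOCD) record, little-endian.
    static final int EOCD_SIG = 0x06054b50;
    // Fixed size of the EOCD record without its trailing comment.
    static final int EOCD_MIN = 22;
    // The comment length field is 16 bits, so the comment is at most 65535 bytes.
    static final int MAX_COMMENT = 0xFFFF;

    /** Returns the offset of the EOCD record in data, or -1 if none is found. */
    static int findEocd(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int lowest = Math.max(0, data.length - EOCD_MIN - MAX_COMMENT);
        // Scan backwards from the last position where the record could start.
        for (int pos = data.length - EOCD_MIN; pos >= lowest; pos--) {
            if (buf.getInt(pos) == EOCD_SIG) {
                return pos;
            }
        }
        return -1;
    }

    /** Reads the central directory start offset (unsigned 32 bit) from the EOCD. */
    static long centralDirectoryOffset(byte[] data, int eocdOffset) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        // The offset field sits 16 bytes into the EOCD record.
        return buf.getInt(eocdOffset + 16) & 0xFFFFFFFFL;
    }
}
```

With a plain InputStream none of this is possible without reading to the
end first, which is the cost being discussed in this thread.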


On Thu, Jan 17, 2013 at 7:42 AM, Torsten Curdt <> wrote:

> > For tar it would be one block (usually 512 bytes), for zip the full
> > central directory has to be read, which could be quite a bit.
> >
> Urgh. Because that's at the end for zips? That's not so good then.
> >
> > I currently plan to implement it inside getNextEntry as it is cleaner.
> > In the tar case I vaguely recall some implementations only write one EOF
> > marker, so a more careful approach is needed in order to not read beyond
> > the end of the archive (likely mark and reset if the stream supports
> > mark).
> >
> Hm. That suddenly makes (2) much more interesting again.
> I can see the back and forth on this :)
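
For what it's worth, the mark/reset idea for tar could look roughly like
the sketch below. It is only an illustration under my own assumptions, not
the planned getNextEntry change: the class and method names are invented, it
assumes the first all-zero 512-byte block has already been read, and it
simply gives up when the stream does not support mark.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration only; not part of commons-compress.
final class TarEofPeek {
    static final int BLOCK = 512;

    /** Reads until buf is full or the stream ends; returns the number of bytes read. */
    static int readBlock(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break;
            }
            total += n;
        }
        return total;
    }

    /** True only for a complete block that contains nothing but zero bytes. */
    static boolean isZeroBlock(byte[] buf, int len) {
        if (len != BLOCK) {
            return false;
        }
        for (byte b : buf) {
            if (b != 0) {
                return false;
            }
        }
        return true;
    }

    /**
     * Call after the first EOF (all-zero) block has been read. Consumes a
     * second EOF block if one follows; otherwise rewinds so that whatever
     * comes after the archive is still unread. Returns true if a second
     * block was consumed.
     */
    static boolean consumeSecondEofBlock(InputStream in) throws IOException {
        if (!in.markSupported()) {
            return false; // cannot peek safely, so leave the stream alone
        }
        in.mark(BLOCK);
        byte[] buf = new byte[BLOCK];
        int n = readBlock(in, buf);
        if (isZeroBlock(buf, n)) {
            return true;
        }
        in.reset();
        return false;
    }
}
```

Either way the caller ends up positioned right after the archive, so any
custom data appended to it can still be read from the same stream.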
