From Damjan Jovanovic <>
Subject Re: [compress] 2.0: Reading and Writing Archives
Date Wed, 22 Jan 2014 04:22:25 GMT
On Wed, Jan 8, 2014 at 4:41 PM, Stefan Bodewig <> wrote:
> Hi,
> putting the exact representation of an archive entry aside I've put down
> an idea of the API for reading and writing archives together with a POC
> port of the AR classes for this API.  All is inside
> The port doesn't look pretty but I wanted to get there quickly and
> change as little as possible, partly to see how much effort porting the
> existing code base would be.  In particular I copied IOUtils into the AR
> package so I don't have to think about a proper package right now.  I
> also didn't care about Java < 7 so far.
> Please have a look (more on the interfaces than the actual
> implementation) and show me how wrong I am :-)
> Some points I'd like to highlight and discuss:
> * ArchiveInput and ArchiveOutput are not Streams (or Channels)
>   themselves
>   This is unlike Archive*Stream in 1.x
>   Emmanuel brought this up in a chat between the two of us and I agreed
>   with him.  You don't really use them as a stream but rather as a
>   stream per entry.
>   For Compressor* I'd still wrap streams/channels, different issue.
> * Using Channels rather than Streams
>   I'm a bit torn about this.  I did so because I'd prefer to base
>   ZipFile and friends on SeekableByteChannel rather than RandomAccessFile
>   - so it would make the API look more symmetric.
>   Drawbacks I've already found
>   - no skip in ReadableByteChannel so you are forced to read data even
>     if something more efficient could be done.  This smells like another
>     IOUtils method.
>   - worse, no mark/reset or pushback, this is going to make format
>     detection uglier as we have to rewind the channel in a different way
>   Another concern might be that Compress 2.0 might get delayed because
>   porting effort was bigger - I've deliberately taken the*
>   route to wrap the existing stream based API in ArArchiveInput and it
>   seems to work (although likely is suboptimal).  Going all-in on
>   Channels in ArArchiveOutput didn't look much more difficult either,
>   but the I/O part of output is simpler anyway.
> * Checked vs Unchecked exceptions
>   I would love to make ArchiveInput be an Iterator over the entries but
>   can't do so as the things we'd need to do in next() might throw an
>   IOException.  One option may be to introduce an unchecked
>   ArchiveException and wrap all checked exceptions (and do so throughout
>   the API).

Doesn't sound very appealing.
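
For illustration, here is a minimal sketch of what that option might look like. Everything in it is hypothetical: EntrySource stands in for whatever reads the next entry, entries are reduced to plain names, and this ArchiveException is only a guess at the unchecked wrapper described above, not the actual 2.0 API. It targets Java 7, so it uses anonymous classes.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch; none of these types are the real Compress 2.0 API.
class UncheckedIteratorSketch {

    /** Unchecked wrapper, standing in for the proposed ArchiveException. */
    static class ArchiveException extends RuntimeException {
        ArchiveException(IOException cause) {
            super(cause);
        }
    }

    /** Stand-in for a reader whose advance step may fail with a checked exception. */
    interface EntrySource {
        /** Returns the next entry name, or null at the end of the archive. */
        String nextEntry() throws IOException;
    }

    /** Adapts an EntrySource to Iterator by wrapping every IOException. */
    static Iterator<String> iterate(final EntrySource source) {
        return new Iterator<String>() {
            private String lookahead;
            private boolean fetched;

            // Reads one entry ahead so hasNext() can answer without losing it.
            private void fetch() {
                if (!fetched) {
                    try {
                        lookahead = source.nextEntry();
                    } catch (IOException e) {
                        throw new ArchiveException(e);
                    }
                    fetched = true;
                }
            }

            @Override
            public boolean hasNext() {
                fetch();
                return lookahead != null;
            }

            @Override
            public String next() {
                fetch();
                if (lookahead == null) {
                    throw new NoSuchElementException();
                }
                String result = lookahead;
                lookahead = null;
                fetched = false;
                return result;
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}
```

Note that even hasNext() can now fail with an unchecked exception, and callers get no hint from the compiler that I/O is happening, which is part of why the idea is unappealing.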

> * RandomAccessArchiveInput as a generalization of ZipFile
>   This extends ArchiveInput so if you ask for an ArchiveInput to a file
>   and the format doesn't support a stream-like interface (like 7z) you
>   can still obtain one.  This is helped a lot by the fact that
>   ArchiveInput is not a stream itself.
> * I'm not sure about ArchiveInput#getChannel
>   Should next return a Pair of ArchiveEntry and Channel instead?

I don't think so, you might not want to look at an ArchiveEntry's
contents, or it might be empty.
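
To make that concrete, a rough sketch of the two-step usage follows. The ArchiveInput interface here is a simplified guess (entries are plain names, and next returns null at the end), and InMemoryArchiveInput and sizeOf are made up for the example. The point is only that a caller can walk all entries and open a channel for just the one it cares about.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.util.Iterator;
import java.util.Map;

// Hypothetical sketch; the interface shape is an assumption, not the proposed API.
class SeparateChannelSketch {

    interface ArchiveInput {
        /** Advances to the next entry and returns its name, or null at the end. */
        String next() throws IOException;

        /** Opens a channel for the current entry's contents, only when asked. */
        ReadableByteChannel getChannel() throws IOException;
    }

    /** Toy implementation backed by a map of entry name to contents. */
    static class InMemoryArchiveInput implements ArchiveInput {
        private final Iterator<Map.Entry<String, byte[]>> entries;
        private byte[] current;
        int channelsOpened; // counts how often contents were actually requested

        InMemoryArchiveInput(Map<String, byte[]> contents) {
            this.entries = contents.entrySet().iterator();
        }

        @Override
        public String next() {
            if (!entries.hasNext()) {
                current = null;
                return null;
            }
            Map.Entry<String, byte[]> entry = entries.next();
            current = entry.getValue();
            return entry.getKey();
        }

        @Override
        public ReadableByteChannel getChannel() {
            if (current == null) {
                throw new IllegalStateException("no current entry");
            }
            channelsOpened++;
            return Channels.newChannel(new ByteArrayInputStream(current));
        }
    }

    /** Returns the content length of the wanted entry, or -1 if it is absent. */
    static int sizeOf(ArchiveInput in, String wanted) throws IOException {
        String name;
        while ((name = in.next()) != null) {
            if (!name.equals(wanted)) {
                continue; // skipped entries never get a channel
            }
            int total = 0;
            ByteBuffer buf = ByteBuffer.allocate(16);
            try (ReadableByteChannel ch = in.getChannel()) {
                int n;
                while ((n = ch.read(buf)) != -1) {
                    total += n;
                    buf.clear();
                }
            }
            return total;
        }
        return -1;
    }
}
```

With a Pair-returning next, every entry would carry a channel whether or not anyone reads it.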

> * tiny change to the contract of ArchiveOutput finish
>   finish used to throw an exception if you didn't call closeEntry for
>   the last entry while putEntry closes the previous entry.  This looked
>   inconsistent and finish now silently closes the entry as well.
> Stefan
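
On the missing skip in ReadableByteChannel mentioned above: a possible shape for that extra IOUtils-style method is sketched below. The class and method names are invented for the example, and it assumes a blocking channel (a non-blocking one could return 0 from read and make the loop spin). When the channel happens to be a SeekableByteChannel it repositions instead of reading.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SeekableByteChannel;

// Hypothetical helper; not part of any existing IOUtils class.
class ChannelSkipSketch {

    /**
     * Skips up to toSkip bytes and returns how many were actually skipped,
     * which is less than requested only if the end of the channel is reached.
     */
    static long skip(ReadableByteChannel channel, long toSkip) throws IOException {
        if (toSkip <= 0) {
            return 0;
        }
        if (channel instanceof SeekableByteChannel) {
            // Efficient path: move the position without reading anything.
            SeekableByteChannel seekable = (SeekableByteChannel) channel;
            long position = seekable.position();
            long remaining = Math.max(0, seekable.size() - position);
            long n = Math.min(remaining, toSkip);
            seekable.position(position + n);
            return n;
        }
        // Fallback: read into a scratch buffer and throw the data away.
        ByteBuffer scratch = ByteBuffer.allocate((int) Math.min(8192, toSkip));
        long skipped = 0;
        while (skipped < toSkip) {
            scratch.clear();
            scratch.limit((int) Math.min(scratch.capacity(), toSkip - skipped));
            int n = channel.read(scratch);
            if (n < 0) {
                break; // end of channel
            }
            skipped += n;
        }
        return skipped;
    }
}
```

The fallback still reads every skipped byte, so it only hides the drawback behind a method; the real saving comes from the seekable branch.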


