commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Emmanuel Bourg <>
Subject Re: [compress] Pack200
Date Mon, 05 Sep 2011 08:25:23 GMT
That looks interesting. Does it provide a repack mode suitable for 
signing compressed jars?

Emmanuel Bourg

Le 04/09/2011 08:04, Stefan Bodewig a écrit :
> Hi,
> I've just committed Converter*Stream implementations for Pack200[1]
> which is a bit unusual in several ways.
> First of all it will (by design of the format) only work on compressing
> valid jar files.  Actually the result isn't likely to be compressed (in
> the sense of "smaller than the original") at all but expects another
> step of GZip compression in most cases.
> The second difference to the other compressors is that the API provided
> by the Java classlib doesn't lend itself to streaming at all.  There is
> a Packer/Unpacker that expects an InputStream and an OutputStream and
> converts from one to the other in a single blocking operation (even
> closing the input side when done).
> I have experimented with Piped*Streams as well as Ant/commons-exec-like
> stream pumping in order to provide a streaming experience but always ran
> into some edge cases where things broke down.  I'll give one example
> below.
> The current implementation of Pack200CompressorInputStream will
> pass the wrapped input and an OutputStream writing to a cache to the
> Unpacker synchronously inside the constructor, consuming the input
> completely.  It will then defer all read-operations to the cache.
> Likewise the Pack200CompressorOutputStream will buffer up all write
> operations in a cache and once finish() or close() is called the cache
> is converted to an InputStream that is then passed together with the
> originally wrapped output to the Packer and written synchronously.
> Caches can be in-memory (using ByteArray*Stream) or temporary files
> controlled by a constructor option with in-memory as the default and
> temp-files for cases where the archives are expected to be big.
> Because of this design the byte-count methods don't make any sense (do
> we count when data is written-to/read-from the cache or while the
> (Un)Packer is doing its work?) and haven't been implemented at all.
> The class names StreamMode and StreamSwitcher result from my attempts of
> using real streams and should be changed unless anybody else comes up
> with a working streaming solution.
> The biggest hurdle for any streaming solution is that there is always
> going to be some sort of intermediate buffer.  Something picks up data
> written to the output stream and makes it available to the input stream
> side.  Once the buffer is full, nothing can be written unless anybody
> reads input in a timely manner.
> In the case of a Pack200CompressorInputStream you don't have any control
> over when the user code is going to read the data and whether it is
> going to consume all of it at all.  For example if the stream is wrapped
> in a ZipArchiveInputStream (it represents a JAR, after all), it is never
> going to be consumed completely because the archive contains ZIP data at
> the end that is ignored by the input stream implementation.
> There are more cases where the Pack/Unpack operation would end up
> blocked so I decided to only code the more robust indirect solution for
> now.
> Stefan
> [1]
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message