commons-dev mailing list archives

From Phil Steitz
Subject Re: [compress] Need API Feedback/Advice for ZipArchiveOutputStream and ZIP64
Date Mon, 08 Aug 2011 05:07:38 GMT
On 8/6/11 9:30 PM, Stefan Bodewig wrote:
> On 2011-08-06, Phil Steitz wrote:
>> On 8/5/11 9:40 PM, Stefan Bodewig wrote:
>>> This means ZipArchiveOutputStream must decide whether it is going to use
>>> the ZIP64 format before it knows whether it would actually need it or
>>> not.  If it signals it is going to use ZIP64 then an implementation that
>>> doesn't support ZIP64 (like Compress 1.2 or [...]) may fail to
>>> read the archive, which is bad if the entry turns out to be smaller than
>>> 4GiB.  If it doesn't signal ZIP64 it can't write big entries at all.
>>> This decision can be made at the granularity of a single entry.  I.e. it
>>> is possible to not use ZIP64 for the majority of entries and enable it
>>> for individual entries.
>>> IMHO there is no right or wrong decision here that the library could
>>> make.  The user-code will have to decide whether ZIP64 should be enabled
>>> or not.  The main questions to me are whether we want to attach this
>>> decision to the stream or the entry itself and what the default should
>>> be.
>> Can you think of practical use cases where setting at the entry
>> level is needed?
> If there is a single entry that uses Zip64 features inside the archive
> then an implementation that doesn't support it is most likely going to
> choke on it anyway.  Compress 1.2 does.
> One - maybe contrived - use case could come up for one of the other
> combinations ZipArchiveOutputStream has to support.  Writing entries of
> unknown size to a seekable stream.  Here each entry gets 20 extra bytes
> compared to Compress 1.2 that you could avoid by turning off Zip64
> support in general and selectively turning it on for entries.  OTOH,
> this implies you'd at least know whether the size was smaller or bigger
> than 4GiB, which is not that likely if you don't know the exact size.
> So no, no compelling use case.
>>> InfoZIP's ZIP has decided to make it an option for the whole archive
>>> (the command line doesn't offer much flexibility here) and make it
>>> default to ZIP64.
>>> My current thinking is that [...] is a likely candidate for the
>>> receiving end of ZIPs we create, so it may be better to turn ZIP64 off
>>> by default, but I'm not sure.
>>> I'm leaning towards adding a setUseZip64(boolean) method at the level of
>>> ZipArchiveOutputStream and make it default to false.  This method could
>>> be called in between putArchiveEntry calls to make it apply selectively
>>> to individual entries.
>> Sounds reasonable.
>>> The name is totally open for debate since as it stands it sounds as if
>>> you could turn off all Zip64 features which I wouldn't want to do for
>>> the cases that can be dealt with transparently.  Then again it could use
>>> a Boolean argument with "null" meaning "do the best you can" and false
>>> "don't even use Zip64 if you think it is safe".
>> I don't get what you mean by "do the best you can."  Does that mean
>> turn it on when needed if somehow you know it is needed, per entry,
>> I assume?
> Actually I was thinking about what the method would mean for the other
> combinations as well.  A "null" value doesn't make sense for the
> specific case I'm asking about - here we need to decide what the default
> should be: Zip64 or not.  For the other cases "null" could mean "use
> Zip64 if you think you need it" i.e. what is implemented right now,
> "true" could mean "always use it" and "false" could mean "never use it,
> throw an exception if you recognize it would be required".

I guess an alternative would be an enum with values "allowed,"
"always," and "never," which would work for the "other" cases but
may be a little unnatural for the simple case above, where I guess
you would have to either throw on the "allowed" setting or view it
as synonymous with "always."  So it is probably best to do as you
suggest: a Boolean, with null meaningful in some combinations and not
allowed or meaningless in others.  Just make sure to clearly document
how it works.  As a test of whether or not it will work, I would
recommend writing the javadoc and test cases first (which of course
we all do anyway ;)
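
To make the enum variant concrete, here is a rough sketch of how the
three values might drive the per-entry decision.  Everything in it is
hypothetical: the names Zip64PolicySketch, Zip64Policy, useZip64 and
ZIP32_LIMIT are made up for illustration and are not part of Compress,
and the 0xFFFFFFFF threshold is my assumption for "too big for the
classic format."  For the simple case (size unknown up front) it takes
the "throw on allowed" route; treating "allowed" as synonymous with
"always" would be the other option mentioned above.

```java
import java.util.OptionalLong;

// Hypothetical sketch only; none of these names exist in Commons Compress.
final class Zip64PolicySketch {

    /** The three settings discussed in the thread. */
    enum Zip64Policy { ALLOWED, ALWAYS, NEVER }

    /** Assumed threshold: sizes at or above this need ZIP64 fields. */
    static final long ZIP32_LIMIT = 0xFFFFFFFFL;

    /**
     * Decides whether an entry should be written with ZIP64 extensions.
     *
     * @param policy the user's choice
     * @param size   the entry size if known up front, empty otherwise
     */
    static boolean useZip64(Zip64Policy policy, OptionalLong size) {
        switch (policy) {
            case ALWAYS:
                return true;
            case NEVER:
                // Fail early if we can already tell ZIP64 would be required.
                if (size.isPresent() && size.getAsLong() >= ZIP32_LIMIT) {
                    throw new IllegalStateException(
                        "entry too big and ZIP64 is disabled");
                }
                return false;
            default: // ALLOWED
                // With an unknown size there is nothing to base the choice
                // on, so this sketch throws instead of guessing.
                if (!size.isPresent()) {
                    throw new IllegalArgumentException(
                        "ALLOWED needs a known size; pick ALWAYS or NEVER");
                }
                return size.getAsLong() >= ZIP32_LIMIT;
        }
    }
}
```

Writing this down shows where the documentation burden sits: the
"allowed" plus unknown-size combination needs an explicit rule either
way.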

>> Libraries that try to be too smart tend to be hard on both users and
>> maintainers.
> Completely agreed.
> Stefan