commons-dev mailing list archives

From Stefan Bodewig <>
Subject Re: [compress] Need API Feedback/Advice for ZipArchiveOutputStream and ZIP64
Date Sun, 07 Aug 2011 04:30:40 GMT
On 2011-08-06, Phil Steitz wrote:

> On 8/5/11 9:40 PM, Stefan Bodewig wrote:
>> This means ZipArchiveOutputStream must decide whether it is going to use
>> the ZIP64 format before it knows whether it would actually need it or
>> not.  If it signals it is going to use ZIP64 then an implementation that
>> doesn't support ZIP64 (like Compress 1.2 or may fail to
>> read the archive, which is bad if the entry turns out to be smaller than
>> 4GiB.  If it doesn't signal ZIP64 it can't write big entries at all.

>> This decision can be made at the granularity of a single entry.  I.e. it
>> is possible to not use ZIP64 for the majority of entries and enable it
>> for individual entries.

>> IMHO there is no right or wrong decision here that the library could
>> make.  The user-code will have to decide whether ZIP64 should be enabled
>> or not.  The main questions to me are whether we want to attach this
>> decision to the stream or the entry itself and what the default should
>> be.

> Can you think of practical use cases where setting at the entry
> level is needed?

If there is a single entry that uses Zip64 features inside the archive
then an implementation that doesn't support it is most likely going to
choke on it anyway.  Compress 1.2 does.

One - maybe contrived - use case could come up for one of the other
combinations ZipArchiveOutputStream has to support: writing entries of
unknown size to a seekable stream.  Here each entry gets 20 extra bytes
compared to Compress 1.2 that you could avoid by turning off Zip64
support in general and selectively turning it on for entries.  OTOH,
this implies you'd at least know whether the size was smaller or bigger
than 4GiB, which is not that likely if you don't know the exact size.
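
A plausible accounting of those 20 bytes, assuming they are the Zip64
extended information extra field written into each local file header
(a 2-byte header ID, a 2-byte data size and two 8-byte size values).
The class and constant names are made up for illustration; they are not
Compress API.

```java
// Sketch only: hypothetical names, not part of Commons Compress.
final class Zip64ExtraFieldSize {
    static final int HEADER_ID_BYTES = 2;          // extra field tag 0x0001
    static final int DATA_SIZE_BYTES = 2;          // length of the data that follows
    static final int UNCOMPRESSED_SIZE_BYTES = 8;  // 8-byte original size
    static final int COMPRESSED_SIZE_BYTES = 8;    // 8-byte compressed size

    /** Extra bytes per entry when the local header carries both 8-byte sizes. */
    static int localHeaderOverhead() {
        return HEADER_ID_BYTES + DATA_SIZE_BYTES
            + UNCOMPRESSED_SIZE_BYTES + COMPRESSED_SIZE_BYTES;
    }
}
```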

So no, no compelling use case.

>> InfoZIP's ZIP has decided to make it an option for the whole archive
>> (the command line doesn't offer much flexibility here) and make it
>> default to ZIP64.

>> My current thinking is that is a likely candidate for the
>> receiving end of ZIPs we create, so it may be better to turn ZIP64 off
>> by default, but I'm not sure.

>> I'm leaning towards adding a setUseZip64(boolean) method at the level of
>> ZipArchiveOutputStream and make it default to false.  This method could
>> be called in between putArchiveEntry calls to make it apply selectively
>> to individual entries.

> Sounds reasonable.

>> The name is totally open for debate since as it stands it sounds as if
>> you could turn off all Zip64 features which I wouldn't want to do for
>> the cases that can be dealt with transparently.  Then again it could use
>> a Boolean argument with "null" meaning "do the best you can" and false
>> "don't even use Zip64 if you think it is safe".

> I don't get what you mean by "do the best you can."  Does that mean
> turn it on when needed if somehow you know it is needed, per entry,
> I assume?

Actually I was thinking about what the method would mean for the other
combinations as well.  A "null" value doesn't make sense for the
specific case I'm asking about - here we need to decide what the default
should be: Zip64 or not.  For the other cases "null" could mean "use
Zip64 if you think you need it" i.e. what is implemented right now,
"true" could mean "always use it" and "false" could mean "never use it,
throw an exception if you recognize it would be required".
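
A minimal sketch of that three-valued reading, assuming the entry size
is known up front.  Zip64Policy and shouldUseZip64 are hypothetical
names, not existing Compress API, and the threshold is the largest
value a classic 4-byte size field can hold (that value doubles as the
Zip64 marker, so sizes at or above it need Zip64).

```java
// Sketch only: hypothetical helper, not part of Commons Compress.
final class Zip64Policy {
    /** Largest value a 4-byte ZIP size field can hold; reserved as the Zip64 marker. */
    static final long ZIP64_MAGIC = 0xFFFFFFFFL;

    /**
     * @param useZip64 null = use Zip64 only when needed, TRUE = always, FALSE = never
     * @param size     entry size in bytes, assumed to be known up front
     * @return whether Zip64 structures should be written for this entry
     */
    static boolean shouldUseZip64(Boolean useZip64, long size) {
        boolean required = size >= ZIP64_MAGIC;
        if (useZip64 == null) {
            return required;              // "use Zip64 if you think you need it"
        }
        if (useZip64) {
            return true;                  // "always use it"
        }
        if (required) {
            // "never use it, throw an exception if it would be required"
            throw new IllegalStateException(
                "entry of " + size + " bytes requires Zip64, but Zip64 is disabled");
        }
        return false;
    }
}
```

This does not cover the case I'm actually asking about, where the size
is unknown when the entry is started; there "null" has nothing to go on
and a default has to be picked.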

> Libraries that try to be too smart tend to be hard on both users and
> maintainers,

Completely agreed.

