commons-dev mailing list archives

From Stefan Bodewig <bode...@apache.org>
Subject Re: [compress] [PATCH] Refactoring of zip encoding support.
Date Tue, 03 Mar 2009 16:11:24 GMT
On 2009-03-03, Wolfgang Glas <wolfgang.glas@ev-i.at> wrote:

> Stefan Bodewig schrieb:
>> On 2009-03-02, Wolfgang Glas <wolfgang.glas@ev-i.at> wrote:

>>> Stefan Bodewig schrieb:
>>>> On 2009-03-01, Wolfgang Glas <wolfgang.glas@ev-i.at> wrote:

>>>>> 1) Unicode extra fields are written for all ZIP entries and not only
>>>>> for entries that are not encodable in the encoding set on
>>>>> ZipArchiveOutputStream.

>>>> Maybe room for yet another flag?  Or an enum-like option

>>>> setCreateUnicodeExtraFields(NEVER | ALWAYS | NOT_ENCODABLE)

>> Consider the WinZIP case: WinZIP wouldn't recognize the EFS flag.  If
>> you set the encoding to UTF-8 and use your code and only add extra
>> fields for non-encodable paths, WinZIP will never see the correct path.

> According to my tests, WinZip recognizes the EFS flag upon
> reading.

Then my documentation is wrong 8-)

> Secondly, if you set the encoding to UTF-8, there's no need for
> unicode extra fields anyway.

Except when your client doesn't recognize the EFS flag and thinks
you're using CP437 - but happily accepts the Unicode extra fields.  I
thought this would be the case for WinZIP.

>> but looking at the names we may be better off with two independent
>> options.  Hmm, yes, right now I prefer two flags because they seem to
>> be orthogonal.

> I think you should choose which approach better fits your needs in
> ant ;-) At least you have to write an XML parser for these settings

You vastly overestimate the effort it takes to write an Ant task.

http://svn.apache.org/viewvc/ant/core/trunk/src/main/org/apache/tools/ant/taskdefs/Zip.java?r1=738330&r2=748593

is all I had to do for the two existing options.

> and the documentation, so you might choose the approach which may be
> explained in brief words.

> I can live very well with two options ;-)

If you throw in "fallbacks" we are actually facing three concepts.

OK, this is what I feel makes most sense:

createUnicodeExtraFields = NEVER (default) | ALWAYS | NOT_ENCODABLE
useLanguageEncodingFlag = true (default) | false
fallbackToUtf8 = true | false

I'm not sure about the default for the latter, probably
default fallbackToUtf8 = (createUnicodeExtraFields == NEVER)
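
To make the interplay of the three options concrete, here is a minimal
sketch in Java.  It is not existing commons-compress code: the class
ZipEncodingOptions, the enum UnicodeExtraFieldPolicy and all method
names are hypothetical, and it assumes that "not encodable" can be
tested with CharsetEncoder.canEncode and that the EFS flag is only
meaningful when the configured encoding is UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical holder for the three proposed options (illustration only). */
final class ZipEncodingOptions {

    /** When to write Unicode extra fields for an entry. */
    enum UnicodeExtraFieldPolicy { NEVER, ALWAYS, NOT_ENCODABLE }

    private UnicodeExtraFieldPolicy createUnicodeExtraFields =
        UnicodeExtraFieldPolicy.NEVER;
    private boolean useLanguageEncodingFlag = true;
    // null means "not set explicitly": the default is derived from the policy.
    private Boolean fallbackToUtf8 = null;

    void setCreateUnicodeExtraFields(UnicodeExtraFieldPolicy policy) {
        this.createUnicodeExtraFields = policy;
    }

    void setUseLanguageEncodingFlag(boolean use) {
        this.useLanguageEncodingFlag = use;
    }

    void setFallbackToUtf8(boolean fallback) {
        this.fallbackToUtf8 = Boolean.valueOf(fallback);
    }

    /** Explicit value if set, otherwise (createUnicodeExtraFields == NEVER). */
    boolean isFallbackToUtf8() {
        if (fallbackToUtf8 != null) {
            return fallbackToUtf8.booleanValue();
        }
        return createUnicodeExtraFields == UnicodeExtraFieldPolicy.NEVER;
    }

    /** Assumption: the EFS flag is set only for UTF-8 encoded names. */
    boolean shouldSetEfsFlag(Charset encoding) {
        return useLanguageEncodingFlag && StandardCharsets.UTF_8.equals(encoding);
    }

    /** Decides per entry whether a Unicode extra field would be written. */
    boolean shouldWriteUnicodeExtraField(String name, Charset encoding) {
        switch (createUnicodeExtraFields) {
            case ALWAYS:
                return true;
            case NOT_ENCODABLE:
                // Only names the configured encoding cannot represent.
                return !encoding.newEncoder().canEncode(name);
            default:
                return false;
        }
    }
}
```

With NOT_ENCODABLE and ISO-8859-1, a name containing an umlaut would
get no extra field while a Japanese name would.  Keeping fallbackToUtf8
as a nullable value is just one way to express a default that depends
on another option.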

Unfortunately I don't really see how we can merge all permutations
into meaningful names otherwise.  But suggestions are welcome.

Stefan


