commons-dev mailing list archives

From Wolfgang Glas <wolfgang.g...@ev-i.at>
Subject Re: [compress] [PATCH] Refactoring of zip encoding support.
Date Tue, 03 Mar 2009 20:28:31 GMT
Stefan Bodewig wrote:
> On 2009-03-03, Wolfgang Glas <wolfgang.glas@ev-i.at> wrote:
> 
>> Stefan Bodewig wrote:
>>> On 2009-03-02, Wolfgang Glas <wolfgang.glas@ev-i.at> wrote:
> 
>>>> Stefan Bodewig wrote:
>>>>> On 2009-03-01, Wolfgang Glas <wolfgang.glas@ev-i.at> wrote:
> 
>>>>>> 1) Unicode extra fields are written for all ZIP entries and not only
>>>>>> for entries, which are not encodable by the encoding set to
>>>>>> ZipArchiveOutputStream.
> 
>>>>> Maybe room for yet another flag?  Or an enum-like option
> 
>>>>> setCreateUnicodeExtraFields(NEVER | ALWAYS | NOT_ENCODABLE)
> 
>>> Consider the WinZIP case, WinZIP wouldn't recognize the EFS.  If you
>>> set the encoding to UTF-8 and use your code and only add extra fields
>>> for non-encodable paths, WinZIP will never see the correct path.
> 
>> According to my tests WinZip recognizes the EFS flag upon
>> reading.
> 
> Then my documentation is wrong 8-)

Sorry for not reading the documentation carefully, but I got stuck because the
EFS flag alone did not seem to be enough, and I wanted to get this straight first.
But I think we've come a long way and the end is near ;-)

>> Secondly, if you set the encoding to UTF-8, there's no need for
>> unicode extra fields anyway.
> 
> Except when your client doesn't recognize the EFS flag and thinks you'd
> be using CP437 - but happily accepts the Unicode extra fields.  I
> thought this would be the case for WinZIP.

Yes, the EFS flag is of limited use. It was added to the spec very late, and
most implementations simply ignore it. Hence they introduced extra fields,
and now we have to live with both 8-)
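
For reference, the two mechanisms can be sketched in Java roughly as follows.
This is only an illustration, not the commons-compress code: the class and
method names are made up. It assumes that the EFS flag is bit 11 (0x0800) of
the general purpose bit flag, and that the Unicode path extra field uses the
Info-ZIP header ID 0x7075 with a version byte, the CRC-32 of the name bytes
stored in the header, and the UTF-8 name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical helper, for illustration only.
class UnicodePathSketch {
    // Assumed: language encoding (EFS) flag is general purpose bit 11.
    static final int EFS_FLAG = 1 << 11;
    // Assumed: Info-ZIP Unicode path extra field header ID.
    static final int UNICODE_PATH_ID = 0x7075;

    // True if the flag bits claim that name and comment are UTF-8.
    static boolean usesUtf8Names(int generalPurposeBits) {
        return (generalPurposeBits & EFS_FLAG) != 0;
    }

    // Builds the extra field: ID, data size, version, CRC-32 of the
    // header name bytes, then the UTF-8 name. All integers little-endian.
    static byte[] unicodePathExtraField(String name, Charset headerCharset) {
        byte[] headerName = name.getBytes(headerCharset);
        byte[] utf8Name = name.getBytes(StandardCharsets.UTF_8);
        CRC32 crc = new CRC32();
        crc.update(headerName);
        int dataSize = 1 + 4 + utf8Name.length;
        ByteBuffer buf = ByteBuffer.allocate(4 + dataSize)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) UNICODE_PATH_ID);
        buf.putShort((short) dataSize);
        buf.put((byte) 1);                 // version
        buf.putInt((int) crc.getValue());  // lets a reader detect a stale field
        buf.put(utf8Name);
        return buf.array();
    }
}
```

A reader that ignores the flag can still pick up the UTF-8 name from the extra
field, which is why both end up in the archive.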

>>> but looking at the names we may be better off with two independent
>>> options.  Hmm, yes, right now I prefer two flags because they seem to
>>> be orthogonal.
> 
>> I think you should choose, which approach better fits your needs in
>> ant ;-) At least you have to write an XML parser for these settings
> 
> You vastly overestimate the effort it takes to write an Ant task.
> 
> http://svn.apache.org/viewvc/ant/core/trunk/src/main/org/apache/tools/ant/taskdefs/Zip.java?r1=738330&r2=748593
> 
> is all I had to do for the two existing options.

That's nice; however, I hope that I can avoid adding Ant to the list of
open-source projects I participate in ;-)

>> and the documentation, so you might choose the approach which may be
>> explained in brief words.
> 
>> I can live very well with two options ;-)
> 
> If you throw in "fallbacks" we are actually facing three concepts.
> 
> OK, this is what I feel makes most sense:
> 
> createUnicodeExtraFields = NEVER (default) | ALWAYS | NOT_ENCODABLE
> useLanguageEncodingFlag = true (default) | false
> fallbackToUtf8 = true | false

Agreed ;-)

> I'm not sure about the default for the latter, probably
> default fallbackToUtf8 = (createUnicodeExtraFields == NEVER)

The default for the latter should be false; it is a special option for people
who know what they are doing.
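
To make the three options concrete, here is a rough Java sketch of how they
could interact for a single entry name. All class, field and method names are
hypothetical, and the semantics are my reading of this thread rather than
settled behaviour: in particular I assume that the language encoding flag is
set only when the name is actually written as UTF-8, and that NOT_ENCODABLE is
judged against the configured encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the option semantics; not an actual API.
class ZipNameOptionsSketch {
    enum UnicodeExtraFieldPolicy { NEVER, ALWAYS, NOT_ENCODABLE }

    UnicodeExtraFieldPolicy createUnicodeExtraFields = UnicodeExtraFieldPolicy.NEVER;
    boolean useLanguageEncodingFlag = true;
    boolean fallbackToUtf8 = false;
    // Placeholder default, chosen arbitrarily for the sketch.
    Charset encoding = StandardCharsets.UTF_8;

    // Can the configured encoding represent this name without loss?
    boolean isEncodable(String name) {
        return encoding.newEncoder().canEncode(name);
    }

    // Charset used for the name in the header: the configured one,
    // or UTF-8 when the fallback is enabled and the name is not encodable.
    Charset charsetFor(String name) {
        if (!isEncodable(name) && fallbackToUtf8) {
            return StandardCharsets.UTF_8;
        }
        return encoding;
    }

    // Assumption: the EFS flag is set only for names written as UTF-8.
    boolean setsLanguageEncodingFlag(String name) {
        return useLanguageEncodingFlag
            && StandardCharsets.UTF_8.equals(charsetFor(name));
    }

    // Whether a Unicode extra field is written for this name.
    boolean writesUnicodeExtraField(String name) {
        switch (createUnicodeExtraFields) {
            case ALWAYS:
                return true;
            case NOT_ENCODABLE:
                return !isEncodable(name);
            default:
                return false;
        }
    }
}
```

With the encoding set to US-ASCII and the policy NOT_ENCODABLE, a name such as
"ä.txt" would get an extra field while "a.txt" would not; the fallback changes
the header charset only if it is switched on explicitly.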

The implementation should be straightforward. Shall I prepare a patch, or can
you do it on your own?

  Regards,

    Wolfgang



