commons-dev mailing list archives

From Wolfgang Glas <wolfgang.g...@ev-i.at>
Subject Re: [compress] State of encoding support in ZIP package
Date Sun, 01 Mar 2009 21:53:48 GMT
Stefan Bodewig wrote:
> On 2009-02-27, Wolfgang Glas <wolfgang.glas@ev-i.at> wrote:
> 
>> Additionally, my experience with WinZip shows, that WinZip writes weird
>> filenames to the single-byte version of the filename when a unicode field is
>> present.
> 
> Hmm, native encoding I'd guess.

Something like that. It looks like they are writing the LSB of a 2-byte value...

> Wolfgang, could you do me a favor and please review what I've written
> for the Ant zip task manual page in svn revision 748593
> <http://svn.apache.org/viewvc?view=rev&revision=748593>, in particular
> <http://svn.apache.org/viewvc/ant/core/trunk/docs/manual/CoreTasks/zip.html?r1=748593&r2=748592&pathrev=748593>?

Seems quite OK ;-)

The one thing I'd like to discuss is the semantics of the useEFS flag in
ZipArchiveOutputStream:

My understanding from the previous discussion was that we need a mode in which
file names that cannot be encoded in the chosen encoding are encoded in UTF-8
instead, which is in turn indicated by setting the EFS flag on the
corresponding ZIP entry. (That's the way 7-zip handles unicode filenames...)
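
To make the proposed fallback mode concrete, here is a minimal sketch in
plain Java. It is not the ZipArchiveOutputStream code: the class name, the
EncodedName holder and the encodeName method are made up for illustration.
The only thing taken from the ZIP specification is that the EFS flag is
bit 11 of the general purpose bit flag.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ZipNameEncodingSketch {

    /** Bit 11 of the general purpose bit flag: name and comment are UTF-8 (EFS). */
    static final int EFS_FLAG = 1 << 11;

    /** Hypothetical holder for the encoded name and the flag bits to write. */
    static final class EncodedName {
        final byte[] bytes;
        final int generalPurposeFlags;

        EncodedName(byte[] bytes, int generalPurposeFlags) {
            this.bytes = bytes;
            this.generalPurposeFlags = generalPurposeFlags;
        }
    }

    /**
     * Encodes a file name in the chosen encoding when every character can be
     * represented; otherwise falls back to UTF-8 and sets the EFS flag.
     */
    static EncodedName encodeName(String name, Charset chosen) {
        if (chosen.newEncoder().canEncode(name)) {
            // The name fits the chosen encoding. The EFS flag is only set
            // when that encoding is UTF-8 itself.
            boolean utf8 = StandardCharsets.UTF_8.equals(chosen);
            return new EncodedName(name.getBytes(chosen), utf8 ? EFS_FLAG : 0);
        }
        // Fallback: the name is not encodable, so store it as UTF-8 and
        // mark the entry accordingly.
        return new EncodedName(name.getBytes(StandardCharsets.UTF_8), EFS_FLAG);
    }
}
```

With ISO-8859-1 as the chosen encoding, a name such as "\u00e4.txt" stays
single-byte without the flag, while a name such as "\u65e5\u672c.txt" is
written as UTF-8 with the flag set.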

The current implementation of the useEFS flag simply allows disabling the
EFS flag on ZIP entries whose names are UTF-8 encoded. This approach does not
conform to the specifications I've read, and I have not seen a single zip
implementation that is disturbed by the EFS flag.

My suggestion would be to simply drop the option to suppress the EFS flag in
UTF-8 encoded files and to introduce a new flag that switches on UTF-8
fallbacks (7-zip mode...).

What other opinions are out there?

  Wolfgang
