jackrabbit-dev mailing list archives

From Julian Reschke <julian.resc...@gmx.de>
Subject Re: General Packaging mechanism
Date Wed, 21 Feb 2007 14:19:35 GMT
Tobias Bocanegra wrote:
>> Unrelated to that...:
>>
>> > - use a standard format for the archive (i.e. zip/jar)
>>
>> If you use ZIP/JAR as format, how are you going to handle non-ASCII
>> characters in filenames in a portable way?
> 
> all non-valid filesystem characters are escaped using url-escaping %xx
> or %uXXXX. actually i haven't looked at how non-ascii characters are
> handled in a jar file, but obviously it works, since i can include
> such a file in a jar file:
> ...

My understanding is that the JAR/ZIP format is silent on filename
encoding. So a producer of these files will have to select an encoding,
and the recipient needs to select the same one. In general, this is not
going to work unless everybody agrees to use UTF-8.
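
To illustrate the producer/recipient mismatch, here is a minimal Java
sketch. It does not touch the ZIP format at all; it only assumes that an
entry name ends up stored as raw bytes in whatever encoding the producer
picked, and is decoded with whatever encoding the recipient picked. The
class name, the method name and the sample filename are made up for this
example, and StandardCharsets needs Java 7 or later.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class FilenameEncodingDemo {

    // Encode the name the way a producer would store it, then decode the
    // stored bytes the way a recipient would read them.
    public static String roundTrip(String name, Charset producer, Charset recipient) {
        byte[] stored = name.getBytes(producer);
        return new String(stored, recipient);
    }

    public static void main(String[] args) {
        // Hypothetical filename: Euro sign, Hebrew shin, Arabic sheen.
        String name = "\u20AC-\u05E9-\u0634.txt";

        // Both sides agree on UTF-8: the name survives.
        String same = roundTrip(name, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(name.equals(same)); // prints "true"

        // Producer writes UTF-8, recipient assumes ISO-8859-1: the name is garbled.
        String mixed = roundTrip(name, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(name.equals(mixed)); // prints "false"
    }
}
```

The second case is the one that bites in practice: nothing fails loudly,
the recipient just sees a different name.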

> [tripod@sulu test]$ touch "到日本来.txt"
> [tripod@sulu test]$ ll
> total 0
> -rw-rw-r-- 1 tripod tripod 0 Feb 21 14:44 到日本来.txt
> [tripod@sulu test]$ cd ..
> [tripod@sulu jcr-car]$ jar cvf test.jar test/
> added manifest
> adding: test/(in = 0) (out= 0)(stored 0%)
> adding: test/到日本来.txt(in = 0) (out= 0)(stored 0%)
> [tripod@sulu jcr-car]$ jar tf test.jar
> META-INF/
> META-INF/MANIFEST.MF
> test/
> test/到日本来.txt

I think it's using the platform encoding, and you happened to try
characters (can't tell from your mail) that can be represented in that
encoding. Try a mix of special characters (Euro sign, Hebrew, Arabic) in
one filename, and retry :-)
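
A quick way to see whether a given name can be represented in a given
encoding at all is to ask a CharsetEncoder. This is only a sketch: the
class name, method name and sample filenames are hypothetical, and it
says nothing about which encoding the jar tool actually uses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class FilenameCharsetCheck {

    // True if every character of the name can be encoded in the charset.
    public static boolean representable(String name, Charset charset) {
        return charset.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        // Hypothetical names: one Latin-1 friendly, one mixing scripts.
        String latin = "caf\u00E9.txt";
        String mixed = "\u20AC-\u05E9-\u0634.txt"; // Euro, Hebrew, Arabic

        System.out.println(representable(latin, StandardCharsets.ISO_8859_1)); // prints "true"
        System.out.println(representable(mixed, StandardCharsets.ISO_8859_1)); // prints "false"
        System.out.println(representable(mixed, StandardCharsets.UTF_8));      // prints "true"
    }
}
```

A name that passes for one platform encoding can still fail for another,
which is why a single successful test proves little.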

Best regards, Julian

(P.S.: we had trouble using ZIP as a content container format two years
ago for the reasons above; maybe the situation has improved, but I really
doubt it)

