james-mime4j-dev mailing list archives

From Stefano Bagnara <apa...@bago.org>
Subject Re: Decoding an attachment filename sent from Mail.app using DecoderUtil.decodeEncodedWords
Date Wed, 20 Apr 2011 13:32:21 GMT
2011/4/20 Adam Warski <adam@warski.org>:
> Hello,
>
> I am trying to decode a filename of an attachment sent with Mail.app. Originally, the file is named "Żółw.rtf" (Polish for Turtle.rtf).
> The headers are:
>
> --Apple-Mail-19-721116558
> Content-Disposition: attachment;
>   filename*=utf-8''Z%CC%87o%CC%81%C5%82w.rtf
> Content-Type: text/rtf;
>   x-unix-mode=0644;
>   name="=?utf-8?Q?Z=CC=87o=CC=81=C5=82w=2Ertf?="
> Content-Transfer-Encoding: 7bit
>
> So the corresponding javax.mail.Part.getFileName() returns "=?utf-8?Q?Z=CC=87o=CC=81=C5=82w=2Ertf?=".
>
> I tried decoding both with mime4j's DecoderUtil.decodeEncodedWords and JavaMail's MimeUtility.decodeText
but the result is: "Żółw.rtf". Clearly not the original :).
>
> For comparison, MimeUtility.encodeText returns:
> =?UTF-8?Q?=C5=BB=C3=B3=C5=82w.rtf?=
> in contrast to:
> =?utf-8?Q?Z=CC=87o=CC=81=C5=82w=2Ertf?=
> coming from the e-mail.
>
> According to my research, the letter "Ż" can be encoded in two ways: either as a single
letter or as "Z" + above-dot. MimeUtility.encodeText uses the former, Mail.app the latter.

Does this also mean that when Mail.app receives a MIME part with the name
=?utf-8?Q?Z=CC=87o=CC=81=C5=82w=2Ertf?= it correctly saves the
attachment as "Żółw.rtf"? That would be odd: I'm not aware (but I'm
not a Unicode guru, which is why I'm asking you to run this test) of any
UTF-8 decoding where a UTF-8 sequence alters the preceding ASCII character,
and the Z is written as plain ASCII in that encoded string.

> Is there some way to properly decode the filename?

You can try using java.text.Normalizer. Let us know if this works.
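
A minimal sketch of that suggestion, assuming the decoders return the
decomposed form (Z followed by U+0307 COMBINING DOT ABOVE, o followed by
U+0301 COMBINING ACUTE ACCENT), which is what the bytes in the header
spell out. The class name FilenameNormalization and the helper toNfc are
made up for illustration; only java.text.Normalizer is a real API here.

```java
import java.text.Normalizer;

public class FilenameNormalization {

    // Hypothetical helper: compose base letter + combining mark pairs (NFC).
    static String toNfc(String decoded) {
        return Normalizer.normalize(decoded, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // Assumed decoder output for the Mail.app header:
        // Z + U+0307, o + U+0301, U+0142 (l with stroke), then "w.rtf".
        String decomposed = "Z\u0307o\u0301\u0142w.rtf";

        String composed = toNfc(decomposed);

        // The decomposed string has two extra chars (the combining marks).
        System.out.println(decomposed.length() + " -> " + composed.length());

        // Compare against the precomposed spelling: U+017B, U+00F3, U+0142.
        System.out.println(composed.equals("\u017B\u00F3\u0142w.rtf"));
    }
}
```

If the assumption holds, the normalized string should compare equal to the
precomposed name that MimeUtility.encodeText produces bytes for.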
Stefano

> Thanks!
>
> --
> Adam Warski
> http://www.warski.org
> http://www.softwaremill.eu
