commons-dev mailing list archives

From Thomas Neidhart <thomas.neidh...@gmail.com>
Subject Re: [email] Problem parsing EMail-Attachmentfilename (ISO-8859-15)
Date Fri, 02 Aug 2013 14:25:11 GMT
Dear Olaf,

It looks like the filename in the data source is not being extracted
properly.
Your change would work for ISO-8859-15 encoded filenames but possibly not
for other encodings, so we should fix this in a general way.
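
For reference, the value you received appears to be an RFC 2231 extended
parameter value (charset'language'percent-encoded bytes), so a general fix
would probably decode that form for any charset instead of falling back to
the Content-Type name. What follows is only a minimal sketch in plain Java,
not the commons-email implementation: the names Rfc2231Sketch and decode are
hypothetical, and parameter continuations (filename*0*, filename*1*, ...)
are not handled.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

// Hypothetical helper, for illustration only; not part of commons-email.
final class Rfc2231Sketch {

    /**
     * Decodes a value of the form charset'language'percent-encoded-text.
     * Values without that prefix are returned unchanged.
     * An unknown charset name makes Charset.forName throw an unchecked exception.
     */
    static String decode(String value) {
        if (value == null) {
            return null;
        }
        int first = value.indexOf('\'');
        int second = first < 0 ? -1 : value.indexOf('\'', first + 1);
        if (first <= 0 || second < 0) {
            return value; // no charset'language' prefix present
        }
        String charsetName = value.substring(0, first);
        String encoded = value.substring(second + 1);

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '%' && i + 2 < encoded.length()) {
                int hi = Character.digit(encoded.charAt(i + 1), 16);
                int lo = Character.digit(encoded.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    bytes.write((hi << 4) | lo); // one raw byte in the named charset
                    i += 2;
                    continue;
                }
            }
            bytes.write(c); // literal character; assumed to be ASCII
        }
        return new String(bytes.toByteArray(), Charset.forName(charsetName));
    }
}
```

With the value from your mail, this sketch turns the percent-encoded bytes
into the first characters of the expected name; the rest of the filename
presumably sits in a further continuation segment, which is why an example
raw mail would help.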

Could you please create an issue on the bug tracker
https://issues.apache.org/jira/browse/EMAIL and attach an example raw mail
which illustrates the problem?

Thanks,

Thomas



On Fri, Aug 2, 2013 at 3:48 PM, Olaf Kaus <olaf.kaus@gmail.com> wrote:

> Hi,
> First of all, thank you for your good work!
>
> I use commons-email 1.3.1 to parse emails from an IMAP server.
> After parsing an email with a PDF attachment, I received the following
> attachment filename:
> ISO-8859-15''%5A%E4%68%6C%65%72%73%74%61%6E%64%73%6D%69%74%74
> But the filename should be “Zählerstandsmitteilung_06_13.pdf”.
>
> I looked into the source code and changed the method
> MimeMessageParser.getDataSourceName() as follows:
>
>
> protected String getDataSourceName(Part part, DataSource dataSource)
>         throws MessagingException, UnsupportedEncodingException {
>     String result = dataSource.getName();
>
>     if (result == null || result.length() == 0) {
>         result = part.getFileName();
>     }
>
>     if (result != null && result.length() > 0) {
>         result = MimeUtility.decodeText(result);
>     } else {
>         result = null;
>     }
>
>     // NEW-Start
>     // result could be:
>     // ISO-8859-15''%5A%E4%68%6C%65%72%73%74%61%6E%64%73%6D%69%74%74
>     // result may still be null here, so guard before searching it
>     if (result != null && result.indexOf("%") != -1) {
>         String rawContentType = part.getContentType();
>         // extract the name from the content type, e.g.:
>         // application/pdf;\n\rname="=?ISO-8859-15?Q?Z=E4hlerstandsmitteilung=5F06=5F13=2Epdf?="
>         int nameIndex = rawContentType.indexOf("name=\"");
>         if (nameIndex != -1) {
>             rawContentType = rawContentType.substring(nameIndex);
>             rawContentType = rawContentType.substring(
>                     rawContentType.indexOf('"') + 1,
>                     rawContentType.lastIndexOf('"'));
>             // ISO-Decoding
>             if (rawContentType.startsWith("=?") || rawContentType.endsWith("?=")) {
>                 result = MimeUtility.decodeText(rawContentType);
>             }
>         }
>     }
>     // NEW-END
>
>     return result;
> }
>
> Could you please review this solution and perhaps fix this behavior in
> the next version?
>
> Thanks in advance
> Olaf K.
>
