james-mime4j-dev mailing list archives

From "Stefano Bagnara (JIRA)" <mime4j-...@james.apache.org>
Subject [jira] Created: (MIME4J-186) QP encoding and dotstuffing issues (let's encode the "." to "=2E")
Date Wed, 26 Jan 2011 10:36:44 GMT
QP encoding and dotstuffing issues (let's encode the "." to "=2E")

                 Key: MIME4J-186
                 URL: https://issues.apache.org/jira/browse/MIME4J-186
             Project: JAMES Mime4j
          Issue Type: Improvement
          Components: dom
    Affects Versions: 0.7
            Reporter: Stefano Bagnara
            Priority: Minor
             Fix For: 0.7

There are non-compliant SMTP/POP tools/gateways/filters out there doing bad stuff with dot

I send trackable emails whose trackable URLs have "." in their path. I estimated that
when a "." ends up at the end or at the beginning of a line (in the QP-encoded HTML part),
between 0.5% and 1% of recipients receive a bad URL (with ".." instead of ".", or with
the "." stripped).

I identified at least the AVG spam filter as showing this bad behaviour when filtering spam for a
generic mail client (but not when used with Outlook). It seems that AVG intercepts the TCP connection
and does its own processing, and in doing so it breaks when dots are at the beginning or end of a line.
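
As an illustration of how both symptoms could arise, here is a minimal sketch of SMTP/POP-style dot-stuffing in Java. The class and method names are hypothetical, and this is only one plausible explanation: the report does not say what AVG actually does internally. The sender doubles a leading "." on a line and the receiver removes one; skipping the removal leaves "..", and removing twice strips the ".".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of Mime4j.
public class DotStuffingDemo {

    // Sender side: prefix an extra "." to every line that starts with ".".
    public static List<String> stuff(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.add(line.startsWith(".") ? "." + line : line);
        }
        return out;
    }

    // Receiver side: drop one leading "." from every line that starts with ".".
    public static List<String> unstuff(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.add(line.startsWith(".") ? line.substring(1) : line);
        }
        return out;
    }

    public static void main(String[] args) {
        // A QP soft line break ("=" at end of line) happens to split the URL
        // so that the next line starts with ".". The URL is made up.
        List<String> body = Arrays.asList(
                "<a href=3D\"http://example.com/t/abc=",
                ".html\">link</a>");

        // Compliant path: stuff once, unstuff once.
        System.out.println(unstuff(stuff(body)).get(1));
        // Agent that forgets to unstuff: the recipient sees "..".
        System.out.println(stuff(body).get(1));
        // Agent that unstuffs data that was never stuffed: the "." is lost.
        System.out.println(unstuff(body).get(1));
    }
}
```

If no line ever starts with a literal ".", none of these paths can change the body, which is the motivation for encoding the dot.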

Of course the example is about a URL because that is the case I am able to monitor and have
statistical evidence for, but this happens with any dot in the message, even in text/plain parts.
Note that having the message altered also breaks DKIM/GPG signatures.

One way to fix this is to change the Quoted-Printable encoder so that it also encodes
dots. This makes the QP-encoded part a bit less readable (who reads them manually today?)
but it protects the stream from non-compliant mail agents. We could even encode the "." only
when it is the first or the last character of the line, but I think that would be "weird" behaviour,
so I propose to simply add the "." to the list of chars to be encoded.
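
To make the proposal concrete, here is a minimal standalone sketch of a Quoted-Printable encoder that always writes "." as "=2E". It is not the Mime4j codec: the class name QpDotEncoder and its methods are hypothetical, the input is assumed to be text using CRLF line breaks, and the small decoder only handles output produced by this encoder.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the Mime4j implementation.
public class QpDotEncoder {
    private static final int MAX_LINE = 76; // QP line limit, soft-break "=" included
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    private static boolean mustEncode(int b, boolean atLineEnd) {
        if (b == '=' || b == '.') return true;         // "." added per the proposal
        if (b == ' ' || b == '\t') return atLineEnd;   // trailing whitespace only
        return b < 33 || b > 126;                      // controls and non-ASCII
    }

    public static String encode(String text) {
        byte[] in = text.getBytes(StandardCharsets.UTF_8);
        StringBuilder out = new StringBuilder();
        int col = 0;
        for (int i = 0; i < in.length; i++) {
            int b = in[i] & 0xFF;
            // CRLF in the input is kept as a hard line break.
            if (b == '\r' && i + 1 < in.length && in[i + 1] == '\n') {
                out.append("\r\n");
                col = 0;
                i++;
                continue;
            }
            boolean atLineEnd = i + 1 == in.length
                    || (in[i + 1] == '\r' && i + 2 < in.length && in[i + 2] == '\n');
            String token = mustEncode(b, atLineEnd)
                    ? "=" + HEX[b >> 4] + HEX[b & 0xF]
                    : String.valueOf((char) b);
            // Insert a soft line break, leaving one column for the "=" itself.
            if (col + token.length() > MAX_LINE - 1) {
                out.append("=\r\n");
                col = 0;
            }
            out.append(token);
            col += token.length();
        }
        return out.toString();
    }

    // Decoder for well-formed output of encode() only (no error handling).
    public static String decode(String qp) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < qp.length(); i++) {
            char c = qp.charAt(i);
            if (c != '=') {
                out.write(c);
            } else if (qp.startsWith("\r\n", i + 1)) {
                i += 2; // soft line break: emit nothing
            } else {
                out.write(Integer.parseInt(qp.substring(i + 1, i + 3), 16));
                i += 2;
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // prints http://example=2Ecom/a=2Eb
        System.out.println(encode("http://example.com/a.b"));
    }
}
```

Because the encoded output never contains a literal ".", no line can start or end with one, so the position-dependent variant (encoding only the first or last "." of a line) is not needed in this sketch. Any compliant decoder turns "=2E" back into ".".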

I think that this could be the new behaviour and that making it configurable is a bit of "over-configurability",
but if people think it should be configurable then please propose a way to configure the
behaviour. The RFC gives us freedom in this regard (it specifies which chars HAVE TO be encoded
and which MAY be left unencoded, but one could even encode every single char).

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.