spamassassin-users mailing list archives

From Philip Prindeville <philipp_s...@redfish-solutions.com>
Subject Re: Funky HARP Spam
Date Fri, 27 Jun 2014 01:02:42 GMT

On Jun 25, 2014, at 5:29 PM, RW <rwmaillists@googlemail.com> wrote:

> On Wed, 25 Jun 2014 14:21:33 -0600
> Philip Prindeville wrote:
> 
> 
>> Here’s the other thing I don’t get.
>> 
>> The message claims to be 7-bit and text/plain, yet it uses encoded
>> characters which exceed 7-bit widths, and this doesn’t seem to be
>> firing any rules either.
>> 
>> &#x042C would seem to be at least an 11-bit wide character.
> 
> You are mixing up different levels of encoding. The characters
> &,#,x,0,4,2 and C are all 7-bit ASCII, and so are consistent with
> Content-Transfer-Encoding: 7bit.

You’re correct… That is consistent with the CTE.
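
To make the two levels concrete, here is a minimal Python sketch (my own illustration, not anything SpamAssassin runs). It checks that every byte of the literal string is 7-bit, then works out how many bits the referenced code point itself needs. The variable names are made up for the example.

```python
# The literal text as it appears in the message body.
raw = "&#x042C"

# Transfer level: every character of the reference is plain 7-bit ASCII.
is_7bit = all(ord(ch) < 128 for ch in raw)

# Character level: the hex digits after "&#x" name a Unicode code point.
code_point = int(raw[3:], 16)           # 0x042C
bits_needed = code_point.bit_length()   # width of the code point itself

print(is_7bit)              # True
print(hex(code_point))      # 0x42c
print(bits_needed)          # 11
```

So the bytes on the wire fit in 7 bits, while the character being referred to does not.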

But the Content-Type omitted a charset="XXX" parameter, which means the character set defaults to US-ASCII.

Quoting RFC-2046:

4.1.2.  Charset Parameter

   A critical parameter that may be specified in the Content-Type field
   for "text/plain" data is the character set.  This is specified with a
   "charset" parameter, as in:

     Content-type: text/plain; charset=iso-8859-1

   Unlike some other parameter values, the values of the charset
   parameter are NOT case sensitive.  The default character set, which
   must be assumed in the absence of a charset parameter, is US-ASCII.
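
The default described in the quoted section can be sketched with Python's standard email package. This is only an illustration of how a parser might apply the rule; the helper name effective_charset and the sample headers are hypothetical, not taken from the spam in question.

```python
from email import message_from_string

def effective_charset(msg):
    """Return the declared charset, or the RFC 2046 default if none is given."""
    # get_content_charset() returns None when no charset parameter is present,
    # and lowercases the value when one is.
    return msg.get_content_charset() or "us-ascii"

# Hypothetical sample messages: one without a charset parameter, one with.
no_charset = message_from_string(
    "Content-Type: text/plain\n"
    "Content-Transfer-Encoding: 7bit\n"
    "\n"
    "body text\n"
)
with_charset = message_from_string(
    "Content-Type: text/plain; charset=ISO-8859-1\n"
    "\n"
    "body text\n"
)

print(effective_charset(no_charset))    # us-ascii
print(effective_charset(with_charset))  # iso-8859-1
```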


Since &#x042C is outside the US-ASCII character set, this would be an encoding violation.

-Philip



> 
> The previous mime section is more problematic since it appears to
> contain 8-bit data. 

