spamassassin-users mailing list archives

From Randy Ramsdell <rramsd...@activedg.com>
Subject Re: new headers rule
Date Fri, 05 Nov 2010 13:28:31 GMT
Lawrence @ Rogers wrote:
> On 04/11/2010 8:11 PM, Karsten Bräckelmann wrote:
>> Moving back on-list, since it doesn't appear to be personally directed
>> at me.
>>
>> On Thu, 2010-11-04 at 19:22 -0230, Lawrence @ Rogers wrote:
>>> On 04/11/2010 7:13 PM, Karsten Bräckelmann wrote:
>>>> No, that requires the Subject to consist of exactly one whitespace.
>>>>
>>>> Read it out loud. The ^ beginning of the string, followed by exactly one
>>>> whitespace char [2]. Followed by the $ end of the string.
>>> No offense, but I am a C and PHP programmer and Perl's documentation is
>>> lacking, to put it politely. Too much theory and far too few actual real
>>> world examples.
>> This is not about Perl, but Regular Expressions. The much more feature-
>> rich (and widely adopted) Perl flavor, out of all the existing variants.
>> But that's actually irrelevant in this case, cause you would need a very
>> limited sub-set only, pretty much available in any tool sporting REs.
>>
>> Any introduction to REs would do, no need to tend to the Perl docs you
>> don't like. Though it sounds like you didn't even have a look at the docs
>> I pointed you to.
>>
>>
>>> That is exactly what I am trying to match, and according to my tests, it
>>> works as expected. When the To and Subject are empty, all that's there
>>> (before the newline) is one whitespace.
>> Are you referring to the whitespace delimiter between the Header: and
>> its content? It's not part of the content.
>>
>>> What I am looking to check is a situation where both the To: and
>>> Subject: headers contain nothing at all, but are set (I've seen this in
>>> several spam e-mails recently)
>> Now you're confusing me. Do you want to match a single whitespace, or a
>> completely empty header?
>>
>>
>>> If there's a better way of doing this, I would appreciate you providing
>>> an example.
>> Well, better way... One that does what you just described.
>>
>> Assuming you want to match "headers containing nothing at all", as per
>> your previous paragraph. That would be nothing between the beginning and
>> end.
>>    header __FOO  Foo =~ /^$/
>>
>> Or, negated, not anything.
>>    header __FOO  Foo !~ /./
>>
>> Now, since you specifically constrained this, you might want to check
>> for the header's existence. Probably not worth it, though. The following
>> is copied from stock 20_head_tests.cf, and documented in SA Conf.
>>    header __HAS_SUBJECT  exists:Subject
>>
>>
>> Anyway, in cases like these it's best to provide a *raw* sample, showing
>> the headers in question completely un-munged and exactly as seen by SA.
>> (Otherwise our help often is limited to guessing and an informal
>> description.) This prohibits copy-n-paste from your MUA, which too often
>> changes subtle but important details.
>>
>> One easy way to come to a conclusion whether you want to match
>> whitespace or not, is the following ad-hoc header rule with spamassassin
>> debug. The matching header's contents are shown in double quotes.
>>
>>    spamassassin -D --cf="header FOO To =~ /^.*/" < msg 2>&1 | grep FOO
>>
>> And just for reference, 'grep' uses REs...
>>
>>
> Thanks Karsten,
> 
> One of these days when I get some free time, I will be sitting down and 
> reading up on REs :)
> 
> Using your examples, and some hackery, I came up with this. It checks 
> for the existence of the To header as well, as SA doesn't seem to have a 
> rule for doing this on its own (a grep -r "exists:To" * on the rules 
> pulled in from updates.spamassassin.org produced nothing).
> 
> # Message has empty To: and Subject: headers
> # Likely spam
> header __LW_HAS_TO exists:To
> header __LW_EMPTY_TO To =~ /^$/
> header __LW_EMPTY_SUBJECT Subject =~ /^$/
> meta LW_EMPTY_SUBJECT_TO (__HAS_SUBJECT && __LW_HAS_TO && __LW_EMPTY_SUBJECT && __LW_EMPTY_TO)
> describe LW_EMPTY_SUBJECT_TO Message has empty To and Subject headers
> score LW_EMPTY_SUBJECT_TO 2.5
> 
> I added this to my custom .cf rules file and ran spamassassin --lint and 
> got no complaints. I ran it over a sample spam, and it hit. I took 
> another spam where both headers had information in them, and it didn't 
> hit. Guess it works as expected :)
> 
> Cheers,
> Lawrence

Am I missing something?

[29480] dbg: check: tests=AWL,BAYES_99,MISSING_SUBJECT

<snip>

Content-Transfer-Encoding: quoted-printable
Subject:
X-MB-Message-Source: WebUI

</snip>
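
For reference, the difference between the patterns discussed in the quoted exchange can be checked outside SA. This is only a rough sketch: it uses Python's re module as a stand-in for the Perl-flavor REs in the rules, the sample values are made up, and ^\s$ is my guess at the "exactly one whitespace" pattern, since the original rule is not shown in the quoted text. It also assumes the value tested is the header content without the delimiter whitespace after "Header:".

```python
import re

# Hypothetical header contents (the text after "Header:", delimiter
# whitespace excluded). These are illustrative values, not real samples.
SAMPLES = {
    "empty": "",
    "single space": " ",
    "normal": "Hello there",
}

# Positive patterns. r"^\s$" is an assumed form of the "exactly one
# whitespace" rule; r"^$" is the /^$/ pattern quoted in the thread.
PATTERNS = {
    "exactly one whitespace": r"^\s$",
    "completely empty": r"^$",
}


def classify(value):
    """Return the names of the checks that match the given header content."""
    hits = [name for name, pat in PATTERNS.items() if re.search(pat, value)]
    # Negated form from the thread (Foo !~ /./): true when no character matches.
    if not re.search(r".", value):
        hits.append("not anything")
    return hits


if __name__ == "__main__":
    for label, value in SAMPLES.items():
        print(f"{label!r}: {classify(value)}")
```

With these samples, only the empty value satisfies /^$/ and the negated /./ check, and only the single space satisfies the one-whitespace pattern. What SA actually passes to a header rule is best confirmed with the debug command quoted above.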

