spamassassin-users mailing list archives

From Alexandre Boyer <>
Subject Re: HTML link regex
Date Thu, 27 Sep 2012 14:41:34 GMT
Hello all,

Here is a small ruleset that I'm working with. I added it to our local
ruleset in prod:


    # Canada Post

    uri_detail   AJB_CANPOST_BADLINK             raw !~ /canadapost\./ text =~ /(?:https?:\/\/(?:www\.)?|www\.)canadapost\./ type =~ /^a$/
    describe     AJB_CANPOST_BADLINK             Found a mismatch between href and anchored text pretending to link to
    score        AJB_CANPOST_BADLINK             1.0
    describe     AJB_CANPOST_PHISH_BADTRACKNUM   Mismatch between href and anchored + unofficial tracking number from CanadaPost
    score        AJB_CANPOST_PHISH_BADTRACKNUM   2.0

    uri_detail   AJB_UTUBE_BADLINK   raw !~ /youtube\./ text =~ /(?:https?:\/\/(?:www\.)?|www\.)youtube\./ type =~ /^a$/
    describe     AJB_UTUBE_BADLINK   Found a mismatch between href and anchored text pretending to link to
    score        AJB_UTUBE_BADLINK   0.5
    # Because of link trackers (from mass mailers, for example), we must
    # meta this with other rules to be sure we are facing our fake YouTube botnet

    describe  AJB_FK_UTUBE_BOTNET     mismatch between href and anchored + empty subject = botnet
    score     AJB_FK_UTUBE_BOTNET     5.5

    # TODO: check if we could work with DKIM, exists:List-Unsubscribe,
    #    in order to avoid FPs from mass mailers.

Note the TODO ;-)

I will work on this later on.
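
To make the mismatch check concrete, here is a rough Python sketch of what the uri_detail conditions in AJB_CANPOST_BADLINK test for a single link. It is not SpamAssassin code: the function name `is_spoofed_link` and the sample URLs are made up for illustration, Python's `re` is assumed to behave like Perl for these simple patterns, and the sketch ignores that one URI can carry several anchor texts.

```python
import re

# Same patterns as the AJB_CANPOST_BADLINK rule (slashes need no escaping here).
RAW_RE = re.compile(r"canadapost\.")
TEXT_RE = re.compile(r"(?:https?://(?:www\.)?|www\.)canadapost\.")
TYPE_RE = re.compile(r"^a$")


def is_spoofed_link(raw, text, link_type):
    """Hypothetical helper: True when the anchor text looks like a
    canadapost URL but the href (raw) never mentions canadapost."""
    return (
        RAW_RE.search(raw) is None                  # raw  !~ /canadapost\./
        and TEXT_RE.search(text) is not None        # text =~ /...canadapost\./
        and TYPE_RE.search(link_type) is not None   # type =~ /^a$/
    )


# Made-up examples: a spoofed link, then a legitimate one.
print(is_spoofed_link("http://evil.example/track", "www.canadapost.ca", "a"))       # True
print(is_spoofed_link("http://www.canadapost.ca/track", "www.canadapost.ca", "a"))  # False
```

A link whose visible text is not a URL at all (for example "click here") is not flagged, which is why the text pattern requires a scheme or a leading `www.`.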

A question here: as per,
one can create a sandbox to submit rules.

Is it open to anybody? Should I do this to submit those rules (and more,
some I already have on my side and future ones)?

Could anyone from the list give me a primer about this process? I'm
interested (with my boss's approval) in becoming a regular contributor, but
this part of the SA project is a little cryptic to me right now.

Do not hesitate to contact me off-list if necessary.

Alex, from prypiat.
Yes, I recycle.

On 12-09-26 11:03 AM, Bowie Bailey wrote:
> On 9/26/2012 10:45 AM, Alexandre Boyer wrote:
>> Hi all,
>> Me happy :-D
>> It works as expected for simple rules.
>> For example, to get rid of my problem with youtube links I had this
>> simple rule:
>>     uri_detail   Z_URIDETAIL_UTUBE_SPOOF   raw !~ /youtube\./ text =~
>>     /(https?://)?(www\.)?youtube\./ type =~ /^a$/
>>     score        Z_URIDETAIL_UTUBE_SPOOF   10.0
> The alternatives on the text regexp are irrelevant.  An equivalent
> simpler regexp would be:
> text =~ /youtube\./
> Any optional text at the beginning or end of a (non-anchored) regexp
> should be left off unless you are trying to capture it for later use.
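
Bowie's point can be checked quickly outside SpamAssassin. The sketch below uses Python's `re` (assumed to behave like Perl for these simple patterns; the sample strings are made up) to show that, in an unanchored search, the optional prefixes never change whether a match is found.

```python
import re

# Pattern with optional prefixes, and the simplified form from the reply.
verbose = re.compile(r"(https?://)?(www\.)?youtube\.")
simple = re.compile(r"youtube\.")

# Made-up sample strings, some containing "youtube." and some not.
samples = [
    "http://www.youtube.com/watch",
    "www.youtube.com",
    "see youtube.com for details",
    "notyoutube.example",
    "http://example.com/",
]

for s in samples:
    # The optional groups can match the empty string, so they can neither
    # prevent nor enable a match: both searches always agree.
    assert bool(verbose.search(s)) == bool(simple.search(s))
    print(s, "->", bool(simple.search(s)))
```

The prefixes only matter if the pattern is anchored or if the captured groups are used afterwards, which is the exception the reply mentions.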
