spamassassin-users mailing list archives

From Philip Prindeville <philipp_s...@redfish-solutions.com>
Subject Re: Dubious hyperlinks
Date Fri, 27 Jun 2014 01:35:17 GMT

On Jun 26, 2014, at 7:31 PM, John Hardin <jhardin@impsec.org> wrote:

> On Thu, 26 Jun 2014, Philip Prindeville wrote:
> 
>> On Jun 25, 2014, at 3:47 PM, John Hardin <jhardin@impsec.org> wrote:
>> 
>>> That still doesn't hit *only* the same GUID repeated. Try this:
>>> 
>>> rawbody L_REPEATING_UUIDS  /<a href="\#" [^\s>]+(;[A-F0-9]{8}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{12})\1\1\1/i
>> 
>> Sorry, that got dropped along the way.  I had tested:
>> 
>> rawbody L_REPEATING_UUIDS       /<a href="\#" .*(;[A-F0-9]{8}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{12})(\1){4,}>/i
>> 
>> and indeed that works correctly.
> 
> OK, that's certainly another valid way to code it.
> 
> Note that you do not need parens around the \1. That captures it again, which just wastes processing.  \1{4,} should work.
> 
> Also, .* in a rawbody rule is a **really** bad idea. Note my suggested alternative, which won't run wild scanning the entire message.
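
For illustration, here is a small sketch of the point about the extra parentheses. It uses Python's re module rather than Perl, and a shortened made-up sample tag, so treat it as an approximation of what SpamAssassin's regex engine does, not a test of the actual rule. It compares (\1){4,} with \1{4,} on a tag carrying six copies of one UUID.

```python
import re

UUID = r"[A-F0-9]{8}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{12}"

# Hypothetical sample: one UUID repeated six times inside the tag.
uuid = "F1B9215E-B1D0-40BC-92D1-F13D501596B7"
sample = '<A href="#" x' + (";" + uuid) * 6 + ">"

# Variant with parentheses around the backreference (adds capture group 2).
grouped = re.compile(r'<a href="#" .*(;' + UUID + r')(\1){4,}>', re.I)
# Variant quantifying the backreference directly (only capture group 1).
plain = re.compile(r'<a href="#" .*(;' + UUID + r')\1{4,}>', re.I)

m_grouped = grouped.search(sample)
m_plain = plain.search(sample)

# Both variants should match the same text; they differ only in how many
# capture groups the engine has to maintain.
same_span = m_grouped.span() == m_plain.span()
print(grouped.groups, plain.groups, same_span)
```

Both patterns accept the same input here; the only difference is the second capture group that the parenthesized form carries along.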


The [^\s>]+ wouldn’t work because there are spaces in there…

<A href="#" philipp&nbsp;2014-06-25 01:20:00;F1B9215E-B1D0-40BC-92D1-F13D501596B7;F1B9215E-B1D0-40BC-92D1-F13D501596B7;F1B9215E-B1D0-40BC-92D1-F13D501596B7;F1B9215E-B1D0-40BC-92D1-F13D501596B7;F1B9215E-B1D0-40BC-92D1-F13D501596B7;F1B9215E-B1D0-40BC-92D1-F13D501596B7><SPAN
style="VISIBILITY: hidden"></SPAN></A>

Note the name, the non-breaking space, and the timestamp before the UUIDs…
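
A quick way to check this outside SpamAssassin is sketched below. It uses Python's re module as a stand-in for Perl, with the sample tag from above. The second pattern, which replaces .* with a length-bounded [^>]{0,80}?, is purely an assumption of mine for illustration (neither pattern in the thread used it, and the 80 is arbitrary); it is one possible way to allow spaces without scanning past the end of the tag.

```python
import re

UUID = r"[A-F0-9]{8}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{4}-[A-F0-9]{12}"

uuid = "F1B9215E-B1D0-40BC-92D1-F13D501596B7"
# Name, non-breaking space entity, and timestamp (with a space) before the UUIDs.
prefix = "philipp&nbsp;2014-06-25 01:20:00"
sample = ('<A href="#" ' + prefix + (";" + uuid) * 6
          + '><SPAN style="VISIBILITY: hidden"></SPAN></A>')

# Pattern in the style suggested in the quoted reply: no whitespace allowed
# between the href and the first UUID.
no_space = re.compile(r'<a href="#" [^\s>]+(;' + UUID + r')\1\1\1', re.I)

# Hypothetical alternative: allow anything except '>' (so spaces are fine),
# but cap the length so the scan cannot leave the tag. The cap of 80 is an
# arbitrary illustrative value.
bounded = re.compile(r'<a href="#" [^>]{0,80}?(;' + UUID + r')\1{4,}>', re.I)

m_no_space = no_space.search(sample)
m_bounded = bounded.search(sample)

print("no-space pattern matched:", m_no_space is not None)
print("bounded pattern matched:", m_bounded is not None)
```

The first pattern stops at the space inside the timestamp and never reaches the UUIDs, while the bounded class gets past it and still cannot run beyond the closing '>' of the tag.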


