httpd-dev mailing list archives

From André Warnier <...@ice-sa.com>
Subject Re: URL scanning by bots
Date Wed, 01 May 2013 12:22:29 GMT
Dirk-Willem van Gulik wrote:
> On 1 May 2013, at 13:31, Graham Leggett <minfrin@sharp.fm> wrote:
>> The evidence was just explained - a bot that does not get an answer quickly enough
>> gives up and looks elsewhere.
>> The key words are "looks elsewhere".
> 
> For what it is worth - I've been experimenting with this (up till about 6 months ago)
> on a machine of mine, with the 200, 403, 404, 500 etc. determined by an entirely
> unscientific 'modulo' of the IP address, both on the main URL and on a few PHP/Plesk
> hole URLs. And I ignored/behaved normally towards any source IP that has (ever) fetched
> robots.txt from the same IP masked by the first 20 bits.
> 
> That showed that bots indeed slow down/do not come back so soon if you give them a 403
> or similar - but I saw no difference as to which non-200 you give them (I did not try a
> slow reply or no reply). Do note though that I was focusing on naughty bots that do not
> fetch robots.txt.
> 
For what it's worth also, thank you.

This kind of response really helps, even if/when it contradicts the proposal that I am
trying to push.  It helps because it provides some *evidence*, which I am having
difficulty collecting by myself, and which would allow the proposal to be *really*
judged on its merits, not just on unsubstantiated opinions.

At another level, I would add this: if implementing my proposal turns out to have no
effect, or only a very small effect, on the Internet at large, but effectively helps the
server where it is active to avoid some of these scans, then given the ease and very low
cost of implementing it, I believe it would still be worth the trouble.
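
To make the scheme above concrete, here is a minimal Python sketch of the selection
logic as I understand it - the status list, the data structures and all the names are
my own assumptions for illustration, not Dirk-Willem's actual code:

    import ipaddress

    # /20 networks that have ever fetched robots.txt; in the experiment
    # described above this would be fed from the access log (assumption).
    ROBOTS_TXT_NETS = set()

    # Candidate statuses for scanners (assumed set, per the message above).
    STATUSES = [200, 403, 404, 500]

    def note_robots_fetch(client_ip: str) -> None:
        """Remember that this client's /20 network fetched robots.txt."""
        ROBOTS_TXT_NETS.add(
            ipaddress.ip_network(f"{client_ip}/20", strict=False))

    def pick_status(client_ip: str) -> int:
        """Pick the status to return for a probed URL.

        Any IP whose /20 has ever fetched robots.txt is treated normally
        (200); everyone else gets a status chosen by an entirely
        unscientific modulo of the IP address.
        """
        addr = ipaddress.ip_address(client_ip)
        if any(addr in net for net in ROBOTS_TXT_NETS):
            return 200
        return STATUSES[int(addr) % len(STATUSES)]

For example, once 66.249.64.1 has fetched robots.txt, pick_status("66.249.64.200")
returns 200 (same /20), while an unknown scanner such as 203.0.113.7 gets whatever its
address modulo 4 selects (here, 500).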

