httpd-dev mailing list archives

From Ben Reser
Subject Re: URL scanning by bots
Date Tue, 30 Apr 2013 16:49:45 GMT
On Tue, Apr 30, 2013 at 3:03 AM, André Warnier wrote:
> Let us imagine for a moment that this suggestion is implemented in the
> Apache webservers, and is enabled in the default configuration.  And let's
> imagine that after a while, 20% of the Apache webservers deployed on the
> Internet have this feature enabled, and are now delaying any 404 response
> by an average of 1000 ms.
> And let's re-use the numbers above, and redo the calculation.
> The same "botnet" of 10,000 bots is thus still scanning 300 Million
> webservers, each bot scanning 10 servers at a time for 20 URLs per server.
> Previously, this took about 6000 seconds.
> However now, instead of an average delay of 10 ms to obtain a 404 response,
> in 20% of the cases (60 Million webservers) they will experience an average
> 1000 ms additional delay per URL scanned.
> This adds (60,000,000 / 10 * 20 URLs * 1000 ms) 120,000,000 seconds to the
> scan.
> Divided by 10,000 bots, this is 12,000 additional seconds per bot (roughly 3
> 1/2 hours).
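
For reference, the quoted arithmetic can be reproduced with a few lines of
Python.  This is only a sketch: the constant and function names are
illustrative, the figures are the ones quoted above, and the model (every
bot keeps 10 servers in flight and sits through the full delay on every
URL) is the one the quoted message assumes.

```python
# Figures taken from the quoted message; names are illustrative only.
SERVERS = 300_000_000      # webservers being scanned
BOTS = 10_000              # size of the botnet
PARALLEL_PER_BOT = 10      # servers each bot scans at a time
URLS_PER_SERVER = 20       # URLs probed on each server
ADOPTION_PERCENT = 20      # share of servers that delay their 404s
EXTRA_DELAY_MS = 1000      # average additional delay per 404


def added_seconds_per_bot():
    """Extra scan time per bot if every bot waits out every delayed 404."""
    delaying_servers = SERVERS * ADOPTION_PERCENT // 100      # 60,000,000
    total_ms = (delaying_servers / PARALLEL_PER_BOT
                * URLS_PER_SERVER * EXTRA_DELAY_MS)
    return total_ms / 1000 / BOTS


if __name__ == "__main__":
    seconds = added_seconds_per_bot()
    print(f"{seconds:,.0f} s per bot, about {seconds / 3600:.2f} hours")
```

Under those assumptions the result is the quoted 12,000 seconds per bot,
which is closer to 3 1/3 hours than 3 1/2.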

Let's assume that such a feature gets added.  Even so, it's not likely
to be enabled by default.  Quite a few sites serve a lot of legitimate
soft 404s, for reasons that I'm not going to bother to get into here.

Any site that goes to the trouble of enabling such a feature is
probably not going to be a site that is vulnerable to what these
scanners are looking for.  So if I were a bot writer, I'd wait for some
amount of time, and if I didn't have a response I'd move on.  I also
wouldn't just move along to the next URL on your web server; I'd
probably move on to a different host entirely.  If nothing else, a
server that responds to requests slowly is not likely to be interesting
to me.

As a result, I'd say your suggestion, if widely practiced, actually helps
the scanners rather than hurting them, because they can identify hosts
that are unlikely to be worth their time scanning with a single request.
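
To put a rough number on that, here is a small Python sketch comparing a
bot that waits out every delayed 404 with one that gives up on a host
after a single timed-out request.  The server counts and the 1000 ms delay
come from the quoted message; TIMEOUT_S is a hypothetical value picked
purely for illustration, and the function name is made up as well.

```python
def added_seconds_per_bot(delaying_servers, bots, parallel_per_bot,
                          requests_per_server, wait_per_request_s):
    """Extra wall-clock seconds per bot spent waiting on delaying servers."""
    total_s = (delaying_servers / parallel_per_bot
               * requests_per_server * wait_per_request_s)
    return total_s / bots


DELAYING_SERVERS = 60_000_000   # 20% of 300 million, from the quoted message
BOTS = 10_000
PARALLEL_PER_BOT = 10
URLS_PER_SERVER = 20
DELAY_S = 1.0                   # quoted average 404 delay
TIMEOUT_S = 0.5                 # hypothetical: how long the bot is willing to wait

# Bot that sits through the full delay on all 20 URLs of every delaying server.
patient = added_seconds_per_bot(DELAYING_SERVERS, BOTS, PARALLEL_PER_BOT,
                                URLS_PER_SERVER, DELAY_S)

# Bot that sends one request, times out, and moves on to a different host.
impatient = added_seconds_per_bot(DELAYING_SERVERS, BOTS, PARALLEL_PER_BOT,
                                  1, TIMEOUT_S)

if __name__ == "__main__":
    print(f"waits out every 404:      {patient:,.0f} s per bot")
    print(f"one request, then leaves: {impatient:,.0f} s per bot")
```

With those numbers the patient bot loses 12,000 seconds and the impatient
one at most 300, and the impatient bot also never sends the other 19
requests to those hosts.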
