tomcat-users mailing list archives

From	André Warnier
Subject Re: Tomcat access log reveals hack attempt: "HEAD /manager/html HTTP/1.0" 404
Date Wed, 17 Apr 2013 18:39:07 GMT
chris derham wrote:
>> Yes.  But someone *does* own the botted computers, and their own
>> operations are slightly affected.  I have wondered if there is some
>> way to make a bot so intrusive that many more owners will ask
>> themselves, "why is my computer so slow/weird/whatever?  I'd better
>> get it looked at.  Maybe I should install a virus scanner."
> Somebody said earlier in the thread (sorry but I can't be bothered to
> find the exact quote and attribute it) something along the lines of
> "this is an arms race". The current bot software may not be there yet,
> but it is easy to see how the bot-net developers would like to have
> the job of probing IPs distributed over the botnet, so each target
> only receives a single call from each distinct IP, but together the
> 10,000 members of the bot-net each send one probe creating a full
> probe of known weak points in the server. The net result would be a)
> very hard to detect/defend against b) the proposal would not have a
> negative effect - you only add 1 second (or whatever value is agreed)
> to the async call time for each botnet member.

Long and thoughtful post. Thanks.
Let me just respond to the above for now.

Say you have a botnet composed of 100 bots, and you want (collectively) to have them scan 
100,000 hosts in total, each one for 30 known "buggy" URLs. These 30 URLs are unrelated to 
each other; each one of them probes for some known vulnerability, but it is not the case 
that if one URL results in a 404, any of the others would, or vice-versa.
So this is 3,000,000 URLs to try, no matter what.
And say that by a stroke of bad luck, all of these 100,000 hosts have been 
well-configured, and that none of these buggy URLs corresponds to a real resource on any 
of these servers.
Then no matter how you distribute this by bot or in time, collectively and in elapsed 
time, it is going to cost the botnet:
- 3,000,000 x 10 ms (~ 8.3 hours) if each of these URLs responds with a 404 within 10 ms
- 3,000,000 x 1000 ms (~ 833 hours) if each of these URLs responds with a 404 delayed by 1 s
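The arithmetic above can be checked with a quick back-of-envelope sketch (using the 
100,000-host / 30-URL figures from the example; the function name is just for 
illustration):

```python
# Back-of-envelope cost of a full scan, using the figures from the example.
HOSTS = 100_000
URLS_PER_HOST = 30
REQUESTS = HOSTS * URLS_PER_HOST          # 3,000,000 probes in total

def total_hours(response_ms: float) -> float:
    """Aggregate busy time across the whole botnet, in hours."""
    return REQUESTS * response_ms / 1000 / 3600

print(f"fast 404 (10 ms): {total_hours(10):.1f} h")      # ~8.3 h
print(f"delayed 404 (1 s): {total_hours(1000):.1f} h")   # ~833.3 h
```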
Now, you can optimise this for elapsed time, or for CPU time, or memory, or bandwidth, 
using various designs.  But you cannot optimise for everything at the same time.
(If that were possible, they would already have done it by now, no?)
For example, you could have each bot open 100 connections to 100 different servers, 
issue one request on each as quickly as possible, and then poll the 100 sockets for a 
response, instead of going to sleep (or doing a blocking read) on each request until a 
response comes.
But then you are using a lot more CPU time to do your polling, and you have 100 
connections open to 100 servers, so you use up more memory (and file descriptors, etc.).
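For illustration, the fan-out-then-poll variant can be sketched with Python's selectors 
module.  This is only a sketch of the pattern, not real bot code: local socket pairs 
stand in for the remote servers, which "reply" with a canned 404 line, and the probe is 
just the URL from this thread's subject:

```python
import selectors
import socket

def poll_many(num_peers: int = 5) -> list[bytes]:
    """Issue one request on each of many non-blocking connections,
    then poll all sockets for responses instead of blocking on each
    one in turn."""
    sel = selectors.DefaultSelector()
    pairs = []
    for i in range(num_peers):
        client, server = socket.socketpair()   # stand-in for a remote host
        client.setblocking(False)
        client.sendall(b"HEAD /manager/html HTTP/1.0\r\n\r\n")  # the probe
        server.sendall(b"404 from peer %d" % i)                 # canned reply
        sel.register(client, selectors.EVENT_READ, data=i)
        pairs.append((client, server))

    replies = [b""] * num_peers
    pending = num_peers
    while pending:                             # poll whichever sockets are ready
        for key, _ in sel.select():
            replies[key.data] = key.fileobj.recv(4096)
            sel.unregister(key.fileobj)
            pending -= 1
    for client, server in pairs:
        client.close()
        server.close()
    return replies
```

The point of the trade-off stands out even in the sketch: the selector keeps every 
socket registered at once, so descriptors and buffers for all peers are held open for 
the duration of the scan.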
Or you can decide to just wait, but then it takes your botnet of 100 bots 100 times as 
long to finish its complete scan of the 100,000 servers.

Without changing your footprint on each bot host, the only way to get your scan completed 
within the same time would be to increase your botnet by the same factor of 100, to 10,000 
bots.  But then the 9,900 additional bots cannot be doing anything else, of course.

Some other calculations:
According to the same Netcraft site, of the 600 million websites, 60% are "Apache" (I 
guess that this includes both httpd and Tomcat, or else Tomcat is counted under "others").
So let's imagine that among the 100,000 sites above, 60,000 run an Apache server, and that 
50% of these (30,000) implement the delay scheme.
The time to scan the 100,000 would then be of the order of (30 URLs x 30,000 hosts x 1 s) 
= 900,000 s ~ 250 hours (to simplify, I am ignoring the other 70,000 "fast 404" ones).
That is still ~240 hours more than the baseline, and may still be enough to make the 
scanning uneconomical.
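Extending the earlier back-of-envelope sketch, the mixed-population estimate works out 
as follows (the 60% and 50% splits are the assumptions stated above):

```python
# Mixed-population estimate: only some hosts implement the 1 s delay.
HOSTS = 100_000
URLS_PER_HOST = 30
APACHE_SHARE = 0.60       # Netcraft's "Apache" share, per the text
DELAYING_SHARE = 0.50     # fraction of those assumed to add the delay

delaying_hosts = int(HOSTS * APACHE_SHARE * DELAYING_SHARE)   # 30,000
slow_seconds = delaying_hosts * URLS_PER_HOST * 1.0           # 900,000 s
print(f"{delaying_hosts} delaying hosts -> {slow_seconds / 3600:.0f} h")  # ~250 h
```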

As for the other points in your post: you are right of course, in everything that you say 
about how to better configure your server to avoid vulnerabilities.
And I am sure that everyone following this thread is now busy scanning his own servers and 
implementing these tips.

But my point is that, over the WWW at large (*), I am willing to bet that less than 20% 
are, and will ever be, so carefully configured and verified.  And that is not going to 
change.  This means that no matter how well 100 million servers are configured, there 
will still be 500 million that are not well-configured, and that among these there are 
enough potential bot targets to justify the scanning.  And as long as this is the case, 
the scanning will continue and it will bother *everybody* (even the 100 million servers 
which are well-configured).  That's because the bots, a priori, don't know that your 
server is well-configured, so you'll still be scanned.

(*) According to Netcraft:
"In the September 2012 survey we received responses from 620,132,319 sites, a decrease of 
8M sites since last month's survey."
80% of that is about 500 million.
