httpd-dev mailing list archives

From Rob Hartill <>
Subject Re: killing robots
Date Mon, 09 Feb 1998 16:06:01 GMT
On Mon, 9 Feb 1998, Paul Sutton wrote:

> Umm, is being attacked by a nasty robot. None of the
> other vhosts we have are affected though. Perhaps it doesn't like apache?
> Just thought I'd let you know in case it is attacking other apache-related
> sites. 
> We got 170,000 hits from it last week (fairly noticeable since we normally
> only get 40,000 or so). It is coming from
> ( with a UA of "GETWWW-ROBOT/2.0".

'GETWWW' is on my list of UA substrings to reject outright.

> We are also getting a few hits from another robot-like thing: from
> ( with UA "Java1.1.3" (there is also
> a Java1.1.4 agent out there, but that has only made a few requests). The

'Java1' and 'Java3' are also on the list.

> robot seems particularly broken -- we use multiviews on every request, but
> Java1.1.3 seems to always add a trailing / unless the link contained an
> extension, then it tries without the /.
> Anyway, what's the current wisdom on how to deal with robots?

catch them early and block them forever.

> Do you match
> its UA & IP, then reject with a 404 or 500, or just trash the whole IP?

blocking UAs is best if they don't pretend to be Mozilla. Actually, it's
also safe to block ^Mozilla/3.0$ ^Mozilla/4.0$ ^Mozilla/4.03$ exactly,
because those are also badly behaved robots trying to spoof servers into
treating them with a respect they don't deserve.
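One way to do this kind of UA blocking in Apache is with mod_browser/mod_setenvif plus mod_access. A hedged sketch (the env variable name is arbitrary, and the patterns are the ones mentioned above):

```apache
# Tag known-bad agents by UA substring/anchor...
BrowserMatch ^GETWWW      bad_robot
BrowserMatch ^Java1       bad_robot
BrowserMatch ^Mozilla/3\.0$  bad_robot

# ...then refuse them everywhere.
<Location />
    Order Allow,Deny
    Allow from all
    Deny from env=bad_robot
</Location>
```

Rejected requests get a 403, which well-behaved clients take as a hint; the worst ones don't, hence the packet-level advice below.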

The best line of defence against the worst offenders is a lower level
packet dropper (e.g. ipfw for FreeBSD). Lots of robots don't appreciate
that 'no' really does mean 'no'.
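For example, a classic ipfw rule to drop an offender outright might look like this (rule number and address are placeholders, not from the original mail):

```
# Silently refuse TCP to port 80 from the offending host.
ipfw add 1000 deny tcp from 10.0.0.1 to any 80
```

At this level the robot never even gets a connection to abuse, which costs the server far less than serving 403s.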

> I
> haven't really kept up with the robot wars, so any advice would be useful. 
> Is there a good site which tracks nasty robot issues?

We used to keep an alert list, but reporting the offenders consumed more
time than they were worth.

Reaction time is the key to saving your server or diskspace from getting
toasted. I run a perl script to count IP hits on the tail end of the
access log every 15-30m. Anything that crosses preset values for rate/volume
triggers an email warning. I have been known to send 1 email message per
unwanted request to site admins when they ignore earlier requests to clean
up their act.
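The counting part of that script is simple to reproduce. Here is a minimal sketch in Python rather than perl (the log format, function name, and threshold are illustrative, not the original script): it tallies hits per client IP over a window of access-log lines and returns the ones that cross a preset limit.

```python
from collections import Counter

def flag_heavy_hitters(log_lines, limit=100):
    """Count hits per client IP (first field of a common-log-format
    line) and return the IPs exceeding `limit` hits in this window."""
    hits = Counter(line.split()[0] for line in log_lines if line.strip())
    return {ip: n for ip, n in hits.items() if n > limit}

# Example: one noisy client, one normal one.
sample  = ['10.0.0.1 - - [09/Feb/1998:16:00:00] "GET / HTTP/1.0" 200 1234'] * 150
sample += ['10.0.0.2 - - [09/Feb/1998:16:00:01] "GET /a HTTP/1.0" 200 99'] * 5
print(flag_heavy_hitters(sample, limit=100))  # → {'10.0.0.1': 150}
```

In practice you would run this from cron against the tail of the access log and mail yourself anything it returns.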


Lincoln Stein <lstein@W3.ORG> is writing an article on bad UAs, you might
want to ping him for pointers to any info he may be willing to share.
