httpd-dev mailing list archives

From Rob Hartill <hart...@ooo.lanl.gov>
Subject Re: Log Ignore Hosts patch uploaded
Date Sun, 25 Feb 1996 14:05:49 GMT
 
> > bozo robots?
> 
> On that topic, anyone have any cute ideas to fight this scourge?

We keep getting hit hard by ignorant robots. Early-warning log monitoring
is useful - once you trap a bozo robot you can block access with "deny"
or (as we have done) add a check to all our perl scripts to intercept
the bozo robots and send them something more interesting.
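For anyone rolling their own, a minimal sketch of that per-script check
(purely illustrative - the host list is a made-up example, and the
redirect target is just our nph-bozo URL below):

   #!/usr/bin/perl
   # sketch: intercept known bozo robots before doing any real work.
   # the %bozo host list is invented for illustration.
   %bozo = ('robot.example.com', 1, 'crawler.example.org', 1);
   $host = $ENV{'REMOTE_HOST'} || $ENV{'REMOTE_ADDR'};
   if ($bozo{$host}) {
       # hand them to the tarpit instead of the real content
       print "Location: http://xxx.lanl.gov/cgi-bin/nph-bozo\n\n";
       exit 0;
   }
   # ...the script's real work continues here...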

Our first line of defense is our "nph-bozo" redirect
 
   http://xxx.lanl.gov/cgi-bin/nph-bozo

that sucks the suckers into a looooong wait. When we have an address for
the bozo robot's owner, that script also mailbombs them: for each new
request, it mails them a log of every request they've made to our server.
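In rough outline, an nph- script works like this (a sketch, not our
actual code - the timings are guesses). Because the server doesn't parse
an nph- script's output, the script writes the raw HTTP response itself
and is free to dribble it out as slowly as it likes:

   #!/usr/bin/perl
   # sketch of an nph- style tarpit; timings are invented
   $| = 1;                              # unbuffered, each line goes out now
   print "HTTP/1.0 200 OK\r\n";
   print "Content-type: text/html\r\n\r\n";
   print "<HTML><BODY>\n";
   for ($i = 0; $i < 600; $i++) {       # roughly ten minutes of "content"
       print "<!-- please hold -->\n";
       sleep(1);
   }
   print "</BODY></HTML>\n";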

Some bozo robots attack in parallel and mail gets bounced or ignored, so
another tactic is needed. Either redirect the bozo robot back to its own
site (in the hope it starts indexing itself), or download a page from
their site that contains lots of links to themselves and feed them that.
Another option is to send them a constant stream of data in the hope that
the disk at the other end fills up and takes the robot with it (ahhh, revenge).
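The redirect-them-home trick is only a couple of lines of CGI (again a
sketch - REMOTE_HOST may be empty if the server isn't resolving names,
so fall back to the raw address):

   #!/usr/bin/perl
   # sketch: point the robot back at whatever host it came from;
   # a plain Location header makes the server issue the redirect
   $host = $ENV{'REMOTE_HOST'} || $ENV{'REMOTE_ADDR'};
   print "Location: http://$host/\n\n";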

Oh, and if you do manage to talk to the owner, make sure you rip them
to shreds and hurl every insult under the sun at them. Here's one I sent
last night....

> Hi,
> 	I've stopped all further access to your site. Just one thing: My 
> logs show that 0 bytes of data was returned from your robots.txt. I tried 
> it manually, and got the right file from one subnet and 0 bytes from 
> another. Any ideas? Our intention was not to do something "great" by 
> bypassing the Robot exclusion protocol, with which our robot fully complies.
> 
> Rupesh

Your stupid robot does not fully comply with the exclusion protocol, as
is obvious from the mindless approach it took as it trashed our server
and the mindless approach its mismanager took by leaving unattended
such a shoddy piece of programming.

We have access logs that go back over 2 years. They show that your
stupid robot requested and was sent the full contents of the robots.txt
file. Your stupid robot chose to ignore explicit requests not to
access specific areas of our server.

Not only did your stupid robot break those rules, but it also did it
in duplicate.

Should your stupid*2 robot ever make the stupid mistake of returning
to our site again, our response will be swift and appropriate.

Please pass these comments on to all the stupid people responsible
for your stupid robot.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

If nothing else, it makes you feel better.

rob
