httpd-dev mailing list archives

From Rob Hartill <hart...@ooo.lanl.gov>
Subject robot denial
Date Thu, 18 Jul 1996 16:58:27 GMT

Well, the robot folks seem to have reluctantly agreed to add "/robot"
to USER-AGENT so that servers can react to it.

It's not clear if any will do it, but that's what they'll send if they
decide to declare themselves in an easier-to-detect manner.

Q.  I was thinking of writing a module that allowed something like

<Directory /foo>
NoRobots on
</Directory>

and for <Location> and .htaccess.

The idea is that if a dir/URL is "protected" with this, then anything
identifying itself as a robot will be denied access (403).
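
For concreteness, a rough, untested sketch of how that could look as a small
module against the Apache 1.x module API; the module, config struct and
function names (norobots_module, norobots_access, etc.) are invented for
illustration, not anything that exists yet:

#include <string.h>
#include "httpd.h"
#include "http_config.h"

module norobots_module;

typedef struct {
    int enabled;                  /* set by "NoRobots on" */
} norobots_dir_config;

static void *create_norobots_dir_config(pool *p, char *dirspec)
{
    norobots_dir_config *cfg =
        (norobots_dir_config *) pcalloc(p, sizeof(norobots_dir_config));
    cfg->enabled = 0;
    return (void *) cfg;
}

static const char *set_norobots(cmd_parms *cmd, void *mconfig, int flag)
{
    ((norobots_dir_config *) mconfig)->enabled = flag;
    return NULL;
}

/* Access checker: refuse anything whose User-Agent carries "/robot". */
static int norobots_access(request_rec *r)
{
    norobots_dir_config *cfg = (norobots_dir_config *)
        get_module_config(r->per_dir_config, &norobots_module);
    const char *agent = table_get(r->headers_in, "User-Agent");

    if (!cfg->enabled || agent == NULL)
        return DECLINED;
    return strstr(agent, "/robot") ? FORBIDDEN : DECLINED;
}

static command_rec norobots_cmds[] = {
    { "NoRobots", set_norobots, NULL, OR_AUTHCFG, FLAG,
      "'on' to deny self-declared robots in this scope" },
    { NULL }
};

module norobots_module = {
    STANDARD_MODULE_STUFF,
    NULL,                         /* initializer */
    create_norobots_dir_config,   /* per-directory config creator */
    NULL, NULL, NULL,             /* dir merger, server config, server merger */
    norobots_cmds,                /* command table */
    NULL,                         /* handlers */
    NULL, NULL, NULL,             /* translate, check_user_id, auth */
    norobots_access,              /* check access */
    NULL, NULL, NULL              /* type checker, fixups, logger */
};

A request would then get 403 only when the directive is on for that scope and
the User-Agent carries the agreed "/robot" marker.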

-=-=-=

But perhaps it might be better to extend the authorization stuff to
allow something like

<Limit GET>
order allow,deny
allow from all
deny agent /robot
</Limit>

which would also allow denial to individual user agents, e.g.

deny agent Crawler/1.4 SiteTrasher/0.0001 Mozilla/42.42
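
The matching itself would be simple; here's a hedged sketch of the kind of
check the access phase would need, again against the Apache 1.x API, with the
helper name agent_matches and its argument layout made up for illustration
(it isn't part of mod_access today):

#include <string.h>
#include "httpd.h"

/* Hypothetical helper: does the request's User-Agent contain any of the
 * agent tokens listed on a "deny agent" line? */
static int agent_matches(request_rec *r, char **agents, int nagents)
{
    const char *ua = table_get(r->headers_in, "User-Agent");
    int i;

    if (ua == NULL)
        return 0;
    for (i = 0; i < nagents; ++i) {
        /* substring match, so "deny agent /robot" also catches agents
         * that merely append the agreed "/robot" marker */
        if (strstr(ua, agents[i]) != NULL)
            return 1;             /* caller would then return FORBIDDEN */
    }
    return 0;
}

A substring rather than exact match seems the safer default, since robots are
only expected to append "/robot" to whatever agent string they already send.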

-=-=-=-=

What do people think is the best route? I like the latter. Is it possible
with the API to write "deny agent" as a module, or is it a patch job?


rob
