httpd-dev mailing list archives

From dgau...@hotwired.com (Dean Gaudet)
Subject Re: No HOST header solutions?
Date Tue, 04 Jun 1996 20:13:38 GMT
In article <hot.mailing-lists.new-httpd-9606022310.aa20391@paris.ics.uci.edu>,
Roy T. Fielding <new-httpd@hyperreal.com> wrote:
>> I guess robots.txt is from pre-CGI days... the "Useragent" thing
>> seems more than a little useless.
>
>That's probably because you don't use MOMspider to help maintain
>your site.  If you did (or used other benign robots), then it would
>be clear as to why different robots would be assigned different
>restrictions.

If your robots.txt is a CGI, you don't need any User-agent line in the
output other than "User-agent: *", since you can code up any policy you
want using the env variables passed to you.  That gives you more things
to "control" robots by than the User-agent field alone.

http://hard.hotwired.com/robots.txt gives back a disallow /, whereas
http://www.hotwired.com/robots.txt lets the entire site be indexed.
I also peek at the HTTP_HOST variable... but haven't yet started asking
search engines to support it.  A few "personal robots" support it already.
This would let me prohibit indexing of "http://hard.www.hotwired.com/".
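
For anyone who wants to try the same trick, here is a minimal sketch of a
robots.txt CGI in Python. It is not the script running at hotwired.com.
The host names are the ones mentioned above, but grouping them in one
NO_INDEX_HOSTS set, the robots_txt() helper name, and the fallback to
SERVER_NAME when no Host header arrives are all assumptions made for
illustration.

```python
import os
import sys

# Hosts that should not be indexed. The names come from the message above;
# treating them as one set is an assumption made for illustration.
NO_INDEX_HOSTS = {"hard.hotwired.com", "hard.www.hotwired.com"}


def robots_txt(environ):
    """Return a robots.txt body chosen from the CGI environment."""
    # HTTP_HOST is only set when the client sent a Host header, so fall
    # back to SERVER_NAME (a standard CGI variable) when it is absent.
    host = environ.get("HTTP_HOST") or environ.get("SERVER_NAME", "")
    host = host.split(":")[0].lower()  # drop any port, ignore case

    if host in NO_INDEX_HOSTS:
        rule = "Disallow: /"
    else:
        rule = "Disallow:"  # an empty value leaves the whole site open
    # A single wildcard record is enough: the decision was made above.
    return "User-agent: *\n" + rule + "\n"


if __name__ == "__main__":
    # CGI response: header block, blank line, then the body.
    sys.stdout.write("Content-Type: text/plain\r\n\r\n")
    sys.stdout.write(robots_txt(os.environ))
```

The same function could look at HTTP_USER_AGENT or any other variable the
server passes in; the output format stays a single "User-agent: *" record
either way.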

hard.www and hard are different addresses.  The former is virtual,
and can be served by any of my working webservers but defaults to hard
if hard is alive (and ready to serve).  The latter is physical and is
only available when the machine named hard is alive.  It's nothing fancy
-- the router is responsible for directing the www.hotwired.com addresses
to a working machine.  All three servers have loopback aliases for all
three of the addresses.

Dean
