httpd-users mailing list archives

From "Boyle Owen" <Owen.Bo...@six-group.com>
Subject RE: [users@httpd] 404's to robots.txt?
Date Wed, 22 Jul 2009 08:02:32 GMT
> -----Original Message-----
> From: Evan Platt [mailto:evan@espphotography.com] 
> Sent: Wednesday, July 22, 2009 1:56 AM
> To: users@httpd.apache.org
> Subject: [users@httpd] 404's to robots.txt?
> 
> So I've noticed quite a lot of connections from web spider programs. 
> I've had a robots.txt
> (User-agent: *
> Disallow: /) for a long time. But looking closer in my Apache logs, 
> am I reading right that it's giving a 404?

Yes.

How many virtual hosts do you have? If robots.txt exists in one virtual
host but the request arrives at another, the client will get a 404. Try
adding %{Host}i to the log format to see the Host header sent by the client.
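
A minimal sketch of that logging change, assuming the stock "combined" format is currently in use. The nickname combined_host and the path logs/access_log are illustrative assumptions, not taken from the original poster's configuration:

```apache
# Standard "combined" fields, plus the Host request header as a final quoted field.
# "combined_host" is a hypothetical nickname chosen for this example.
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Host}i\"" combined_host

# Assumed log location; point this at whichever access log the server already writes.
CustomLog logs/access_log combined_host
```

After a restart, comparing the last field on the 404 lines with the same field on the 200 lines should show whether the crawlers are asking for a hostname that is served by a different virtual host.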

Rgds,
Owen Boyle
Disclaimer: Any disclaimer attached to this message may be ignored. 

> 
> 65.55.106.173 - - [21/Jul/2009:09:44:43 -0700] "GET /robots.txt 
> HTTP/1.1" 404 208 "-" "msnbot/2.0b 
> (+http://search.msn.com/msnbot.htm)"
> 65.55.106.112 - - [21/Jul/2009:10:11:43 -0700] "GET /robots.txt 
> HTTP/1.1" 404 208 "-" "msnbot/2.0b 
> (+http://search.msn.com/msnbot.htm)"
> 65.55.106.166 - - [21/Jul/2009:11:03:35 -0700] "GET /robots.txt 
> HTTP/1.1" 404 208 "-" "msnbot/2.0b 
> (+http://search.msn.com/msnbot.htm)"
> 65.55.106.160 - - [21/Jul/2009:11:09:07 -0700] "GET /robots.txt 
> HTTP/1.1" 200 28 "-" "msnbot/2.0b (+http://search.msn.com/msnbot.htm)"
> 65.55.106.180 - - [21/Jul/2009:11:35:34 -0700] "GET /robots.txt 
> HTTP/1.1" 404 208 "-" "msnbot/2.0b 
> (+http://search.msn.com/msnbot.htm)"
> 
> Same day, no changes made:
> X.X.X.X - - [21/Jul/2009:16:47:44 -0700] "GET /robots.txt HTTP/1.1" 
> 304 - "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; 
> rv:1.9.1.1) Gecko/20090715 Firefox/3.0.7, Ant.com Toolbar 1.3 (.NET 
> CLR 3.5.30729)"
> Z.Z.Z.Z - - [21/Jul/2009:16:49:10 -0700] "GET /robots.txt HTTP/1.1" 
> 200 28 "-" "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) 
> AppleWebKit/530.5 (KHTML, like Gecko) Chrome/2.0.172.30 Safari/530.5"
> 
> Two different IPs: one mine, one a friend's.
> 
> Any suggestions as to why (if I'm reading the log right) I'm handing 
> out a 404 to, it appears, only web crawlers?
> 
> # httpd -v
> Server version: Apache/2.2.3
> Server built:   Jun 16 2009 11:28:50
> 
> Don't know what other information is needed to help troubleshoot... 
> Running on an OS X box.
> http://www.espphotography.com/robots.txt if you want to take a look...
> 
> Thanks. :)
> 
> Evan
> 
> 
> ---------------------------------------------------------------------
> The official User-To-User support forum of the Apache HTTP 
> Server Project.
> See <URL:http://httpd.apache.org/userslist.html> for more info.
> To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
>    "   from the digest: users-digest-unsubscribe@httpd.apache.org
> For additional commands, e-mail: users-help@httpd.apache.org
> 
> 