tomcat-users mailing list archives

From: Markus Fischer <mar...@fischer.name>
Subject: Re: howto stop crawler and bots according to their user agent string
Date: Tue, 15 Jul 2008 12:27:05 GMT

Hello Mathias,

Mathias Walter wrote:
> I don't know it exactly. The problem is that the sites are linked from
> anywhere. I'm not sure, if a crawler that follows the link
> http://mydomain:port/servlet/page.jsp, looks for the robots.txt in the ROOT
> webapp.

Just last week we installed a robots.txt where there was none before, and
it took more than 24 hours until most bots (GoogleBot, MsnBot, Yahoo Slurp)
re-read it. And yes, bots are supposed to read the file /robots.txt at the
root of the host when accessing /some/thing/here.jsp. If it's a malicious
bot, you're out of luck anyway, since it will simply ignore the file.
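
To illustrate where a well-behaved crawler would look, here is a minimal
sketch in Java. The class name RobotsTxtLocation, the method robotsUrlFor
and the port 8080 are made up for this example; it only shows that the
path of the page is discarded and /robots.txt is resolved against the
host and port.

```java
import java.net.URI;

public class RobotsTxtLocation {

    // Hypothetical helper: returns the robots.txt URL a compliant crawler
    // would fetch before requesting the given page. Resolving an absolute
    // path keeps scheme, host and port and replaces the rest.
    static String robotsUrlFor(String pageUrl) {
        return URI.create(pageUrl).resolve("/robots.txt").toString();
    }

    public static void main(String[] args) {
        // Port 8080 is an assumption standing in for "port" in the question.
        System.out.println(robotsUrlFor("http://mydomain:8080/servlet/page.jsp"));
    }
}
```

In a Tomcat setup this means the file has to be served by whichever
webapp is mapped to "/", which is normally ROOT, and not by the webapp
that contains the page.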

HTH,
- Markus
