tomcat-users mailing list archives

From Markus Fischer
Subject Re: howto stop crawler and bots according to their user agent string
Date Tue, 15 Jul 2008 12:27:05 GMT
Hello Mathias,

Mathias Walter wrote:
> I don't know exactly. The problem is that the sites are linked from
> anywhere. I'm not sure whether a crawler that follows the link
> http://mydomain:port/servlet/page.jsp looks for the robots.txt in the ROOT
> webapp.

Just last week we installed a robots.txt where there had been none, and
it took more than 24 hours until most bots (GoogleBot, MsnBot, Yahoo
Slurp) checked for it again. Yes, bots are supposed to read the file
/robots.txt at the root of the host when accessing
/some/thing/here.jsp. If it's a malicious bot, you're out of luck
anyway, because it will simply ignore the file.
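
To make the lookup rule concrete, here is a minimal sketch of how a
well-behaved crawler could derive the robots.txt location from any page
URL: it keeps the scheme, host and port and replaces the path with
/robots.txt. The function name robots_txt_url and the port 8080 are
assumptions made up for this illustration, not taken from any real
crawler. With Tomcat's default context mapping, a request for
/robots.txt is normally served by the ROOT webapp, so that is where the
file would have to live.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL a compliant crawler would fetch for page_url.

    Hypothetical helper for illustration: only scheme and host:port are kept;
    the path, query string and fragment of the page URL are discarded.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Port 8080 is an assumed value standing in for "port" in the quoted URL.
print(robots_txt_url("http://mydomain:8080/servlet/page.jsp"))
```

A robots.txt placed inside the /servlet context would therefore never
be requested by such a crawler.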

- Markus
