infra-issues mailing list archives

From "Paul Querna (JIRA)" <>
Subject [jira] Commented: (INFRA-1578) Allow GoogleCodeBot in robots.txt
Date Wed, 09 Apr 2008 05:17:24 GMT


Paul Querna commented on INFRA-1578:

IMO, we should wait until the European SVN mirror is online. (Machines are being installed
in the DC this week.)

At that time, I would prefer that we just enable all crawlers in robots.txt.

Speaking as a former employee of a Google competitor, I don't believe we should give Google
any special treatment in this regard :P

If we have problems with ViewVC, which is why the block was added originally, we should make
the disallow specific to it, and keep the raw SVN repos open.
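
As a rough illustration of the ViewVC-only disallow suggested above, here is a minimal
Python sketch that uses the standard library's urllib.robotparser to compare such a rule
with the blanket rule quoted in the issue. The /viewvc/ and /repos/asf/ path prefixes, the
constant names, and the is_allowed helper are assumptions made for illustration, not the
actual server layout or configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical scoped rule: block only the ViewVC web interface and leave
# everything else, including raw SVN paths, open to all crawlers.
# The /viewvc/ prefix is an assumption for illustration.
SCOPED_ROBOTS_TXT = """\
User-agent: *
Disallow: /viewvc/
"""

# The blanket rule quoted in the issue, for comparison.
BLANKET_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if robots_txt lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # /repos/asf/ is an assumed raw-repository path, used only as an example.
    for label, rules in (("scoped", SCOPED_ROBOTS_TXT), ("blanket", BLANKET_ROBOTS_TXT)):
        for path in ("/repos/asf/", "/viewvc/"):
            print(label, path, is_allowed(rules, "GoogleCodeBot", path))
```

With the scoped rule no crawler is named explicitly, so GoogleCodeBot is handled like
every other user agent: raw repository paths are fetchable and only the ViewVC prefix
is blocked.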

> Allow GoogleCodeBot in robots.txt
> ---------------------------------
>                 Key: INFRA-1578
>                 URL:
>             Project: Infrastructure
>          Issue Type: Wish
>      Security Level: public(Regular issues) 
>          Components: Website
>            Reporter: Mike Aizatsky
> Hello,
> We at Google have received quite a few complaints about Apache
> software source code being unavailable on Google Code Search
> ( We've investigated the issue, and
> found that you have a robots.txt file disallowing even our special
> google code crawlers (
> User-agent: *
> Disallow: /
> We believe this was done to tell ordinary web crawlers to stay away
> from your svn repositories, but we have a custom,
> svn-interface-conformant crawler in codesearch. Can you relax your
> robots.txt for us and allow "GoogleCodeBot" to index your site? Or if
> you're reluctant to change your file, can you just confirm that we're
> free to index your source code?
> --
> Regards,
> Mike

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
