www-infrastructure-issues mailing list archives

From "Jeff Turner (JIRA)" <j...@apache.org>
Subject [jira] Created: (INFRA-1946) Allow crawling of Bugzilla bugs
Date Fri, 20 Mar 2009 03:15:50 GMT
Allow crawling of Bugzilla bugs
-------------------------------

                 Key: INFRA-1946
                 URL: https://issues.apache.org/jira/browse/INFRA-1946
             Project: Infrastructure
          Issue Type: Improvement
      Security Level: public (Regular issues)
          Components: Bugzilla
            Reporter: Jeff Turner
            Assignee: Mark Thomas


Currently the issues.apache.org robots.txt disallows crawling of Bugzillas:

$ curl http://issues.apache.org/robots.txt
...
Disallow: /bugzilla
Disallow: /SpamAssassin

I think the benefits of having the data crawled justify a bit of experimentation here.
Perhaps we could try a variant of Red Hat's robots.txt, specifying a crawl-delay and
restricting crawlers to just the index and show_bug.cgi:

https://bugzilla.redhat.com/robots.txt
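
To make the idea concrete, here is a rough sketch of what such a variant might look like, checked with Python's standard urllib.robotparser. The rules, the 10-second delay, the index.cgi path and the bug numbers are assumptions for illustration only; they are not copied from Red Hat's file and are not a tested proposal. Python's parser applies the first matching rule, so the Allow lines are listed before the Disallow lines.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt variant: allow only the index page and
# show_bug.cgi under /bugzilla, keep everything else there blocked,
# and ask crawlers to pause between requests. The delay value and the
# index.cgi path are illustrative assumptions.
PROPOSED_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Allow: /bugzilla/show_bug.cgi
Allow: /bugzilla/index.cgi
Disallow: /bugzilla
Disallow: /SpamAssassin
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_crawlable(parser: RobotFileParser, path: str) -> bool:
    """Return True if a generic crawler ("*") may fetch the given path."""
    return parser.can_fetch("*", "http://issues.apache.org" + path)


if __name__ == "__main__":
    parser = build_parser(PROPOSED_ROBOTS_TXT)
    for path in (
        "/bugzilla/show_bug.cgi?id=12345",  # individual bug page
        "/bugzilla/buglist.cgi?query=foo",  # search results
        "/SpamAssassin/show_bug.cgi?id=1",  # left untouched in this sketch
    ):
        print(path, is_crawlable(parser, path))
    print("crawl delay:", parser.crawl_delay("*"))
```

Whether the SpamAssassin Bugzilla should get the same Allow lines is left open here; the sketch keeps its existing Disallow unchanged.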


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

