cocoon-dev mailing list archives

From Stefano Mazzocchi <>
Subject Re: [C2] Link filtering and Content aggregation
Date Fri, 06 Oct 2000 09:59:55 GMT
Giacomo Pati wrote:

> > more or less.... but it's pretty easy to have an implicit "robots.txt"
> > resource directly created by Cocoon even if the file is not present,
> > based on sitemap parameters.
> Yes, but (after reading it) the robots.txt spec says that there is only
> one robots.txt, and the request URI is "/robots.txt" for the whole site
> (not a sub-context like "/cocoon/robots.txt").
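For reference, the format the spec describes is just records of User-agent and Disallow lines, served only from the site root (the paths below are illustrative):

```
# Must live at http://host/robots.txt -- not under a sub-context
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
```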

Ah, ok. But we can at least have it generated by the CLI from the sitemap,
don't you think?
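A minimal sketch of what such CLI generation might look like — the class name and the way exclusions are obtained are hypothetical, not actual Cocoon API; the idea is just to serialize sitemap-derived exclusions into the robots.txt record format:

```java
import java.util.List;

// Hypothetical sketch: build a robots.txt body from a list of paths
// that the sitemap marks as excluded from crawling.
public class RobotsTxtGenerator {

    // Produce one "User-agent: *" record followed by a Disallow
    // line per excluded path.
    public static String generate(List<String> disallowedPaths) {
        StringBuilder sb = new StringBuilder("User-agent: *\n");
        for (String path : disallowedPaths) {
            sb.append("Disallow: ").append(path).append("\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Example paths only; a real CLI would read them from the sitemap.
        System.out.print(generate(List.of("/cocoon/private/", "/cgi-bin/")));
    }
}
```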

(BTW, is there a sort of)
> > > Shouldn't we express the crawl attribute to the outside by a request
> > > URI to "robot.txt"?
> >
> > exactly
> I must disagree after reading the robots.txt spec. It's not possible for
> Cocoon.

> > > Or is crawling from the commandline and crawling by
> > > a spider different?
> >
> > good point, didn't think of that. what do you think?
> Using /robots.txt means writing the robots.txt by hand, deploying it into
> the root context and not specifying it in the sitemap. If we can't
> exactly simulate a command line environment (like the http environment)
> we need to distinguish between them, because in fact there is no
> difference between a spider and a browser.

Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<>                             Friedrich Nietzsche
 Missed us in Orlando? Make it up with ApacheCON Europe in London!
------------------------- http://ApacheCon.Com ---------------------
