forrest-dev mailing list archives

From Ross Gardler
Subject Re: [RT] crawl our dynamic forrest rather than commandline
Date Thu, 01 Sep 2005 17:21:58 GMT
Nicola Ken Barozzi wrote:
> David Crossley wrote:
>>We would rather use Forrest in dynamic mode
>>so that we do not need to worry about the
>>filename extensions in the output space and
>>take more advantage of the Cocoon facilities
>>like "Cocoon views" etc.
>>However, we must be able to produce a static
>>set of documents. That constrains us to the
>>filename extension thing.
>>Would it be possible to use an external tool
>>like "wget" or maybe Apache Ant, to crawl a local
>>Forrest server and detect the mime-types and create
>>the set of files, appending the appropriate extension?
>>That is just a wild thought, but so many times
>>i read back through our mail archives, and see
>>us hindered by this need to stick with the
>>filename extensions and limit our use of Cocoon.
>>Our design decisions are hampered.
> I'm sure (I have seen the code) that Cocoon CLI has been thought to be
> able to crawl also links with ?a=b parameters in it, although I have
> never tried it.

I have tried it. The problem is that a '?' is not legal in a file name
on some platforms, so it gets converted to a '_' (I think that's the
character, but I can't remember exactly). As a result, anything with a
parameter breaks the filenames.
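
To make the filename issue concrete, here is a rough Python sketch of the kind of mapping an external crawler would need: take the URL and the Content-Type returned by the server, swap the '?' for a '_', and append an extension chosen from the mime-type. The function name, the extension table, and the localhost:8888 URL are all illustrative assumptions; this is not how the Cocoon CLI or wget actually does it.

```python
from urllib.parse import urlsplit

# Assumed mime-type to extension table, for illustration only.
EXTENSIONS = {
    "text/html": ".html",
    "application/pdf": ".pdf",
    "text/plain": ".txt",
    "image/png": ".png",
}


def output_filename(url, content_type):
    """Map a crawled URL plus its Content-Type header to a relative file name.

    Hypothetical helper, not part of Forrest, Cocoon, or wget.
    """
    parts = urlsplit(url)
    name = parts.path.lstrip("/")
    # Directory-style URLs get a default document name.
    if not name or name.endswith("/"):
        name += "index"
    if parts.query:
        # '?' is not legal in file names on some platforms, so join the
        # query string to the path with '_' instead.
        name += "_" + parts.query
    # Drop parameters such as "; charset=UTF-8" before the lookup.
    mime = content_type.split(";")[0].strip().lower()
    ext = EXTENSIONS.get(mime, "")
    if ext and not name.endswith(ext):
        name += ext
    return name


if __name__ == "__main__":
    print(output_filename("http://localhost:8888/docs/page?a=b",
                          "text/html; charset=UTF-8"))
```

Note that this mapping is not reversible: a page requested as "page?a=b" and a page literally named "page_a=b" end up with the same file name, so links between the generated files would also have to be rewritten to match.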

I have no idea how something like wget does it.

