cocoon-dev mailing list archives

From "Upayavira"
Subject Re: [RT] Fixing the CLI
Date Tue, 04 Mar 2003 17:14:48 GMT
On Mon, 24 Feb 2003 Nicola Ken wrote:
> Traversing optimizations
> -------------------------
> As you know, the Cocoon CLI gets the content of a page 3 times.
> I had refactored these three calls to Cocoon in the methods (in call
> order):
> ....... 
> Now, with the -e option we basically don't need step 2. If done 
> correctly, this will increase the speed! :-)
> So we have two steps left: getting links and getting the page.
> If we can make them into a single step we're done.
> Cocoon has the concept of pluggable pipelines. And each pipeline is
> responsible for connecting the various components. If we used a
> pipeline that simply inserts between the source and the next
> components a pipe that records all links
> ( into the
> Environment, we can effectively get both the result and the links in a
> single pass.

I have gone ahead and coded my attempt at Nicola Ken's CLI traversal 
optimisation. I didn't use pluggable pipelines. Maybe I should have.

Basically, if you switch off extension checking, then it is possible to gather links 
from within the pipeline. So, I hunted out the locations in the pipeline code 
( where the LinkTranslator is added to the pipeline, and added 
an optional LinkGatherer in its place. Instead of translating links using a
map in the ObjectModel, it places the links it finds into a List in the ObjectModel.
That list is then available to the CLI afterwards, which adds the URIs to its list of
links to be scanned. And it seems to work.
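
To make the idea concrete, here is a minimal sketch of a link-gathering pipe. It is not the actual patch: the class name LinkGathererSketch, the key "gathered-links", and the choice of the href and src attributes are all assumptions for illustration, and a plain SAX XMLFilterImpl with a Map stands in for Cocoon's own pipe and ObjectModel.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical stand-in for a link-gathering pipe; not Cocoon's actual class. */
class LinkGathererSketch extends XMLFilterImpl {
    /** Hypothetical key under which the gathered links are stored. */
    static final String LINKS_KEY = "gathered-links";

    /** Attributes treated as links here; an assumption for this sketch. */
    private static final String[] LINK_ATTRIBUTES = {"href", "src"};

    private final List<String> links = new ArrayList<>();

    LinkGathererSketch(XMLReader parent, Map<String, Object> objectModel) {
        super(parent);
        // Publish the list so the caller can read it after the run.
        objectModel.put(LINKS_KEY, links);
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        // Record each link, then pass the event on unchanged.
        for (String name : LINK_ATTRIBUTES) {
            String value = atts.getValue(name);
            if (value != null) {
                links.add(value);
            }
        }
        super.startElement(uri, localName, qName, atts);
    }

    /** Runs the document through the filter once and returns the links seen. */
    static List<String> gather(String xml, Map<String, Object> objectModel)
            throws Exception {
        XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        LinkGathererSketch gatherer = new LinkGathererSketch(reader, objectModel);
        // A real pipeline would have a serializer downstream; here events are dropped.
        gatherer.setContentHandler(new DefaultHandler());
        gatherer.parse(new InputSource(new StringReader(xml)));
        return gatherer.links;
    }
}
```

Because the filter forwards every event untouched, the downstream output is the same as without it; the links are a side effect of the one pass.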

This means that so long as one does not want to confirm that extensions match 
the mime-type of the document, one can build a site generating each page only 
once, which is great.
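
The resulting crawl can be sketched as follows, assuming each pipeline run hands back both the page content and the links gathered during that run. The names SinglePassCrawlSketch, PageResult and the generate function are hypothetical and do not correspond to the CLI's real code.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of a crawl that generates every page exactly once. */
class SinglePassCrawlSketch {

    /** Result of one pipeline run: the page content plus the links gathered. */
    static final class PageResult {
        final String content;
        final List<String> links;

        PageResult(String content, List<String> links) {
            this.content = content;
            this.links = links;
        }
    }

    /**
     * Starts at startUri and follows gathered links until none are left.
     * The generate function stands in for a single call into the pipeline.
     */
    static Map<String, String> crawl(String startUri,
                                     Function<String, PageResult> generate) {
        Map<String, String> pages = new LinkedHashMap<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(startUri);

        while (!pending.isEmpty()) {
            String uri = pending.poll();
            if (pages.containsKey(uri)) {
                continue; // already generated, do not run the pipeline again
            }
            PageResult result = generate.apply(uri); // the only run for this page
            pages.put(uri, result.content);
            for (String link : result.links) {
                if (!pages.containsKey(link)) {
                    pending.add(link);
                }
            }
        }
        return pages;
    }
}
```

There is no separate link-discovery call in the loop: the links arrive with the page, which is what removes the extra generation per page.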

So what now? Is anyone interested in seeing it?

Regards, Upayavira
