forrest-dev mailing list archives

From: Jeff Turner <>
Subject: Re: CLI Caching, etc
Date: Mon, 18 Aug 2003 12:51:20 GMT
On Sun, Aug 17, 2003 at 04:41:33PM +0100, Upayavira wrote:
> Just to keep you up to date with my work on the CLI:


> I found a bug which meant that Cocoon wasn't holding onto its cache each 
> time it shut down and restarted, which explained why the CLI wasn't 
> using its cache. Vadim fixed the bug.
> So, the CLI can now read out of the cache correctly. However, it seems 
> that the cache is either slightly slower than page generation or much 
> the same, so there's no real benefit from this bug fix, at least in this area.

It would be interesting to know how much of the pipeline is actually
cached.  The times for a first and second Forrest run are 2:36 and 2:39
respectively, and they're suspiciously similar; as if the cache is being
checked but not used.  For instance, rendering site.pdf takes 20s on
first and second rendering.  The timestamps are very useful, btw!

> Also, because pages now come from the cache, pipelines aren't processed, 
> and the LinkGatherer component no longer works, so we only have LinkView 
> gathering for following links :-(

Hmm.. tricky.  If the LinkGatherer output is a byproduct of running the
pipeline, and the pipeline output is cached, then perhaps the
LinkGatherer output should also be cached?
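To make the idea concrete, here is a minimal sketch (in Python, purely illustrative — Cocoon is Java and its real cache API differs; `PageCache`, `render_page`, and the pipeline callable are all hypothetical names) of storing the gathered links next to the cached pipeline output, so that a cache hit still yields the page's links without re-running the pipeline:

```python
# Illustrative sketch only: Cocoon is Java and its actual cache API differs.
# Idea: cache the gathered links alongside the rendered output, so a cache
# hit can still report which links the page contains.

class PageCache:
    def __init__(self):
        self._entries = {}  # key -> (rendered_bytes, links)

    def store(self, key, rendered, links):
        self._entries[key] = (rendered, list(links))

    def fetch(self, key):
        """Return (rendered, links) on a hit, or None on a miss."""
        return self._entries.get(key)


def render_page(key, cache, pipeline):
    hit = cache.fetch(key)
    if hit is not None:
        rendered, links = hit        # hit: links come straight from the cache,
        return rendered, links       # no pipeline run needed
    rendered, links = pipeline(key)  # miss: run the pipeline, gathering links
    cache.store(key, rendered, links)
    return rendered, links
```

With that shape, the link crawl would keep working whether or not the page itself was served from the cache.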

> The one benefit of this is that it is now easy to identify whether a 
> page came out of the cache and, if it did, to compare the timestamp of 
> the file on disc with the timestamp of the cached element, and only save 
> to disc if the cached element is newer. So, we haven't yet sped things 
> up, but we have got it to only update changed files.

I don't really understand this.  Surely if site.pdf takes 20s on first
and second rendering, it's updating an unchanged file?
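As I read the description, the save-only-if-changed check would look something like this (a Python sketch of the described logic, not Cocoon's actual code; the function name and the mtime comparison are assumptions):

```python
import os

def should_save(dest_path, cached_mtime):
    """Save only when the cached element is newer than the file on disc.

    Sketch of the logic described above, not Cocoon's implementation:
    a missing destination file must always be written.
    """
    if not os.path.exists(dest_path):
        return True
    return cached_mtime > os.path.getmtime(dest_path)
```

If site.pdf is rewritten on every run, then either the cached element's timestamp keeps advancing or the PDF never comes from the cache at all.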

> So, if you are happy with link view (for the moment), and like the idea 
> of only updating pages that have changed, then update to CVS Cocoon. 
> Otherwise, stick with the one you've got.

Oh well, link-view lets us filter out unwanted links, even if it's really
the user-agent's job (you convinced me;), so I'm happy with CVS.

Oh, mind if I make one change to the output?  Instead of having the time
on a separate line:

* [0] document-v12.pdf
         [1.356 seconds]
* [38] community/howto/index.html
         [0.524 seconds]
* [0] community/howto/index.pdf
         [0.262 seconds]
* [0] /favicon.ico
         [0.052 seconds]

Have the times right-indented:

* [0] document-v12.pdf                  [1.356 seconds]
* [38] community/howto/index.html       [0.524 seconds]
* [0] community/howto/index.pdf         [0.262 seconds]
* [0] /favicon.ico                      [0.052 seconds]

It saves lots of screen bandwidth, and makes the output more parseable.
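A fixed-width pad would do it — something like this (a Python sketch of the layout only; the CLI itself is Java, and the function name and column width are my own):

```python
def format_line(link_count, path, seconds, width=40):
    """Pad the page entry so the timing lands in a right-hand column."""
    entry = "* [%d] %s" % (link_count, path)
    return "%s[%.3f seconds]" % (entry.ljust(width), seconds)
```

Long paths would push past the column, but for typical entries the times line up and a simple split on `[` recovers both fields.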

> I would be interested in your comments upon this mixed set of consequences.
> I will try to get linkGathering working again, but it does involve 
> digging a bit further than I'm used to.

In true open-source fashion, we'll be here in our armchairs cheering you
on ;)  I saw you commit some CLI refactorings - does that mean the code
is stable enough yet for bystanders to start poking at it and trying to
understand it?
> Regards, Upayavira
