forrest-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From David Crossley <cross...@apache.org>
Subject Re: Quick glimpse at Forrest's build times
Date Thu, 18 Aug 2005 14:57:49 GMT
Ron Blaschke wrote:
> I did a quick comparison of Forrest revision 231407 (before Cocoon
> update) and revision 230867 (somewhere near HEAD).

Flip those revision numbers of course.

Thanks a million for starting this Ron.

Hopefully it will lead into techiques that we can
use to continually monitor and fine-tune our product.

> Unfortunately, very little showed up.  Here's what I did.
> 
> I ran a CPU profile on both revisions, which wasn't very helpful at
> all.  Nothing interesting showed up.  "chaperon" seems to eat a
> significant amount of CPU time, though.

Is Chaperon being used all the time, or just for
certain processing? The latter i hope.

> Then I tried something else: Taking a look at the numbers output by
> Cocoon, which prints statements like the following.
> 
> * [3/43]    [8/25]    1.892s 5.4Kb   pluginDocs/index.html
> * [4/43]    [1/25]    0.962s 5.7Kb   issues.html
> * [6/42]    [1/29]    0.971s 6.2Kb   tools/index.html
>                       ^^^^^^
> 
> I decided to compare the time reported between the two revisions,
> using a single run on each revision.
> First the totals.
> 
> Revision 231407  Total time: 20 minutes 31 seconds ( 624 seconds)
> Revision 230867  Total time: 10 minutes 24 seconds (1231 seconds)
> 
> Seems to be similar to the time difference reported by others.
> 
> There's a difference of about 607 seconds to account for (on my box).
> I matched the individual times for each document, using a simple
> script.  The result is like this:
> Page                         Old       New      Diff
> /docs_0_60/changes.html      4.487s -> 9.534s    5.047s
> /docs_0_60/changes.pdf       5.538s -> 9.914s    4.376s
> /docs_0_60/changes.rss       0.170s -> 0.110s   -0.060s
> /docs_0_60/compliance.html   0.942s -> 1.722s    0.780s
> /docs_0_60/compliance.pdf    0.440s -> 0.241s   -0.199s
> /docs_0_60/faq.html          1.712s -> 8.232s    6.520s
> /docs_0_60/faq.pdf           0.882s -> 6.359s    5.477s
> ...
> 
> I pulled everything into a spreadsheet, and took the total of all time
> differences.  The total is about 602 seconds, in my opinion close
> enough to the expected 607.
> 
> Below are the pages that contribute 5 seconds or more to the
> difference.  All numbers are in seconds.
> Be warned that I currently don't know if these numbers have any
> significance at all.  They may mean a lot, or nothing at all!  The
> main reason I am presenting them is that I'll probably don't
> have any more time for this issue this week, and they may (or may
> not!) be useful to someone.

Well they certainly prove that it is damn slow. :-)

> Note that many pages seem to be generated twice, because of slightly
> different URLs; eg
>     /docs_0_80/changes.pdf          2,453   14,411  11,958
>     docs_0_80/changes.pdf           3,184   14,772  11,588
> (second and third line in the table below)

Yeah, i have wondered about that for ages. A build
of site-author today reports building 785 documents
yet 'cd site-author; find . -type f | wc -l' gives 442.

I wonder if that is something to do with
resources/stylesheets/absolutize-linkmap.xsl

> Also note that the times (generation and difference) may be close
> together, like above.  But they may also differ quite significantly.
>     docs_0_60/changes.html          2,003   13,409  11,406
>     /docs_0_60/changes.html         4,487   9,534   5,047
> Not quite unexpected, as these times are wall clock. There are other
> processes running on my box, the garbage collector, ...    
>
> Page                            Old     New     Difference
> docs_0_60/changes.pdf           2,653   15,673  13,020
> /docs_0_80/changes.pdf          2,453   14,411  11,958
> docs_0_80/changes.pdf           3,184   14,772  11,588
> docs_0_60/changes.html          2,003   13,409  11,406
> /docs_0_80/changes.html         1,973   12,718  10,745
> /docs_0_70/changes.html         2,684   13,339  10,655
> docs_0_70/changes.pdf           5,188   15,623  10,435
> docs_0_80/faq.html              1,312   11,386  10,074
> /docs_0_80/faq.pdf              2,043   12,077  10,034
> /docs_0_70/changes.pdf          2,623   11,757  9,134
> docs_0_70/changes.html          1,592   9,934   8,342
> index.html                      1,221   9,184   7,963

This is the strangest one. The index page is simple.

The other pages like faq, changes, include a cumulative
effect because there is more complex Cocoon sitemap
for those.

Cocoon has a basic Profiler block that we can use to
debug certain parts of our sitemap process.
See their live demo:
http://cocoon.zones.apache.org/demos/release/samples/blocks/profiler/welcome
We have the Cocoon Profiler block in our core already.

Perhaps we should also look at simple things
like rolling back the versions of Xerces and Xalan.

-David

Mime
View raw message