cocoon-dev mailing list archives

From Stefano Mazzocchi <>
Subject [RT] Cocoon's own publishing system
Date Tue, 11 Mar 2003 15:49:22 GMT
Diana Shannon wrote:
> On Tuesday, March 11, 2003, at 05:32  AM, Pier Fumagalli wrote:
>>> I suggest linking to but
>>> I guess that's your intention ;-)
>> Yes, but at that point we'll have to re-build the site once again...
> Ok, we shouldn't be so limited in rebuilding the site as often as 
> necessary. 

The wiki makes it pretty evident that our documentation publishing system
is currently too hard to use and too slow to work with; people are even
afraid to touch it.

We must fix this.

> Here's why it has been tedious and time consuming in the 
> past, at least for me.

Ok, let's start planning something better.

> 1. Many, many committers weren't updating release and head branches with 
> their doc updates. It took time to scrutinize differences in the 
> branches, to make sure all relevant docs were in the release branch, 
> which is what is used to generate the web site.

Agreed this is a problem.

Hopefully, now that we have two clearly separated repositories, people
will document only in the appropriate one.

> 2. Updating the live site repository is time consuming, at least for me, 
> on a slow dial-up connection (I live in a rural area of the US with no 
> broadband option). The api docs directory is the time killer here. I 
> spent eight hours, one night, simply performing a cvs update followed by 
> a cvs commit. The most recent update wasn't so bad. The commit/update 
> took only 2.5 hours.

Oh, god. I didn't know that. This is a shame. I'm sorry, Diana. I know
you don't pay per-minute dial-up fees as we normally do here in Europe,
but still.

We must build a better system.

> 3. I was really excited about Forrest transition, thinking the 
> automation would save me all of the above time which I could devote to 
> docs content. Unfortunately:
> - only a few committers participated in the trial run, so it seemed to 
> me, interest/support is not that great.

I would like to know the issues that are still left on the table to
solve, and to work on them. Forrest is clearly the way to go, and the
site transition gives us the opportunity to think about it.

> - Forresters seemed to suggest, and I could be wrong, that the live site 
> cvs update would **still** be required even with Forrest. 

No, not necessarily, but a totally different system must be set up in its place.

> Thus, I failed 
> to see how the transition would make my volunteer committer life any 
> more liberated, since this time killing step was still necessary.

This *must* go away.

> I'm happy to help with updating the site based on the revised cvs mirror 
> links discussed in this thread. However, I can't do it until later this 
> week. In the future, I think it's better if more committers would share 
> the burden of updating the live site cvs every now and then, 
> particularly those with greater bandwidth connections. In the hopes that 
> this will happen, I'll post detailed instructions on how this can be 
> done on wiki. (I've posted email instructions on two separate occasions 
> in the past which I will now fine tune.)

Please do, those will help, but for now, let's clear the whiteboard and
start outlining the best publishing system.

                                 - o -

Apache has very high security standards. If our web sites get hacked,
the ASF's image of quality and security is damaged. Easier doc
publishing at the cost of weaker security is not an option.

This means that every solution must be *designed* around security.

IMO, the metapattern of IoC gives us a lot of security. So, the best 
publishing system would be something like:

  repository -(generation)-> staging -(publishing)-> production


  repository is used for storing our documents

  staging is the location of the staged documentation

  production is the location of the final docs

 From a security standpoint, a compromise of the staging area is not a
big risk, which means that staging can be automated without major
political concerns.

While the above arrows indicate the flow of data, the flow of control 
must be inverted to provide complete vertical security:

  repository <-(reads from)- staging <-(reads from)- production

                                  - o -

So, here is the plan I propose:

1) repository is CVS on icarus, as it is today. no changes required in
the editing/authoring process (for now, at least)

2) automated staging server is moof (or nagoya)

I'd suggest installing it on moof (or nagoya) [moof is a Mac OS X server
donated to the ASF by Apple, located on their campus with lots of
bandwidth and, as of yesterday, support for the final Java 1.4.1; it is
administered by the Apache infrastructure team]

Checkout is done over anoncvs, so the repository cannot be compromised
from the staging server.

3) the staging server should work for docs as gump works for nightly 
builds, nagging the appropriate mail list if:

  - docs aren't valid
  - links are broken

Note that javadocs and idldocs must be automated as well.

4) the staging server will regenerate results automatically, grabbing
the changes directly out of CVS. this operation will run unassisted,
just like gump. in fact, forrest was born to be the gump alter-ego for
documentation.
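
Such an unassisted run could be as simple as a nightly cron entry on the staging box (script name and paths are hypothetical):

```
# hypothetical crontab entry: pull from anoncvs, rebuild with forrest,
# run the checks, and log the result, every night at 02:00
0 2 * * * /home/forrest/bin/rebuild-site.sh >> /var/log/forrest-build.log 2>&1
```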

5) when publishing to production is needed, a person with an account on
icarus will simply log in and perform a remote rsync between the staging
area and the server. A simple script with a readme will do the job. I
estimate it might take less than 60 seconds to update the web site this
way, since the throughput between moof and daedalus is several Mbit/s.
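The publishing script could boil down to a single rsync invocation; the staging host and production docroot below are illustrative placeholders:

```python
# Hypothetical staging source and production docroot.
STAGING = "moof.apache.org:/export/staging/cocoon-site/"
DOCROOT = "/www/cocoon.apache.org/"

def publish_cmd(staging=STAGING, docroot=DOCROOT):
    """Build the rsync command that mirrors staging into production.

    Run on icarus, it *pulls* from staging over ssh, matching the
    inverted control flow; --delete keeps production an exact mirror,
    and rsync's delta transfer is what makes a sub-minute update
    plausible even for a large site.
    """
    return ["rsync", "-az", "--delete", "-e", "ssh", staging, docroot]
```

The readme would amount to: log in, run the script, done.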

                                  - o -

 From a usability point of view, we gain:

- no need for broadband. This means that people can update the site even 
on a GPRS cellphone, or on a slow link in Africa.

- we are nagged if things go wrong: continuous integration for documents.

- the site update frequency will hopefully improve, thus improving our 
quality of service.

 From a security point of view:

- we use existing ASF-proven security infrastructure (everything is done 
over SSH)

- by keeping the staging server on a different machine, with no
information on how to access the others, we don't create more points of
attack.
                                  - o -

NOTE: from an operational point of view, Pier has enough karma to set up
everything we need on moof or nagoya, as well as to provide accounts for
those who want to help running the staging server (I suspect Jeff and
Steven would be interested in helping out, and hopefully others as well).
We might need to post our plan of action to infrastructure@ once we 
decide what to do, but since there are no security issues they shouldn't 
be concerned about it (I've already discussed this architecture and 
people didn't have objections).

Comments and suggestions would be very much appreciated.


