forrest-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: readying the cross-project build environment
Date Sat, 15 Jun 2002 15:03:31 GMT
Your emails often remind me of the wise quote: intelligence is about
posing questions, not answering them. This is for sure one of them.

Diana Shannon wrote:
> 
> On Thursday, June 13, 2002, at 01:31  PM, Steven Noels wrote:
> 
> Will site updates be based on a manual trigger or time-based trigger
> (once a day, once an hour, etc.)?

This hasn't been defined.

There has been some resistance in the past to automating tasks on the
Apache servers: this is 99% because of the security risks they pose, since
gaining control of an automated task would lead to asynchronous attacks,
which are much harder to track down.

At the same time, since the forrest process will not reside on the same
machine that serves the generated content, this shouldn't be an issue
even for root@apache.org which is normally *very* conservative (and for
a good reason!)

One big note about the security of apache.org: imagine a scenario where a
black-hat hacker gains root access on www.apache.org, places a backdoor
in an official apache distribution and removes its traces.

Using proper timing and PR relationships, this person (or group) will be
able to do massive damage to the online economy first, and to the apache
brand second.

By wise forecast of stock market fluctuations, the attackers could
potentially earn millions of dollars.

This is why the apache infrastructure is managed in a very conservative
way: protecting apache means protecting the web and the ecosystem
(social and economic) on top of it, which is worth billions of dollars
and influences millions of individuals.

Since Forrest wants to become a piece of the apache infrastructure, we
must be aware of how important security should be for us.

This said, we should ask how dangerous it is to have automatic forrest
runs and what is best for the usability of the apache infrastructure.
 
> > * there will be three source retrieval mechanisms:
> >
> >      - a simple filecopy (for projects local to the forrestbot host)
> >      - grabbing sources from anoncvs
> >      - fetching sources using scp (public/private key distribution
> >        required)
> 
> Sources are retrieved from what cvs branch? For example, with Cocoon,
> would this be the release branch?

Gump is able to specify what branch to checkout. Forrest should do the
same.
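
A minimal sketch of what per-project branch selection could look like,
assuming Forrestbot simply shells out to the cvs client. The function
name and the example root, module and branch are hypothetical; only the
`-d` and `-r` options are standard cvs.

```python
from typing import List, Optional


def cvs_checkout_command(cvsroot: str, module: str,
                         branch: Optional[str] = None) -> List[str]:
    """Build the argument list for a cvs checkout of one module.

    When no branch is given, cvs checks out the trunk (HEAD).
    """
    cmd = ["cvs", "-d", cvsroot, "checkout"]
    if branch:
        # -r selects a branch or tag, as Gump allows per project.
        cmd += ["-r", branch]
    cmd.append(module)
    return cmd


# Hypothetical values, for illustration only.
print(cvs_checkout_command(":pserver:anoncvs@cvs.example.org:/home/cvspublic",
                           "example-module", "release_branch"))
```

The list can be handed to subprocess.run() as-is, which avoids shell
quoting problems with odd branch names.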

> What if someone is in the process of
> committing a bunch of modified files which, if retrieved by Forrestbot at
> the wrong time, cause a build failure?

Eh, shit happens, one could say :)

No, seriously, this hasn't been a problem on Gump, which is scheduled to
run once a day. For manually-started systems, this could be worse... but
we could use the CVS lock mechanism for that.... even if it's normally
suggested *not* to use locking in a public environment.
 
> What if I'm updating release and want to turn off any Forrest retrieval
> until I'm finished. Will I be able to do that?

One simple (hacky?) way of doing this is to place a 'retrievable' flag
someplace in the CVS module. It could be a file, an element in the xgump
file and so on.

This is much lighter than a lock, since it doesn't require the person
who locked the resource to unlock it (which could potentially create
deadlocks), but it still solves the issue.
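
A minimal sketch of the file variant of the flag, assuming it is a small
text file at the top of the checked-out module. The file name, its
"true"/"false" content and the default for a missing file are all my
assumptions, nothing here is decided.

```python
from pathlib import Path

# Hypothetical file name; the flag could just as well live in the xgump file.
FLAG_FILE = "forrest-retrievable"


def is_retrievable(module_dir: str) -> bool:
    """Return False only when the flag file exists and says 'false'."""
    flag = Path(module_dir) / FLAG_FILE
    if not flag.is_file():
        # Assumption: a module without the flag file is retrievable by default,
        # so projects that never touch the flag keep getting built.
        return True
    return flag.read_text().strip().lower() != "false"
```

Forrestbot would call is_retrievable() on the fresh checkout and skip
the build and deploy steps for that project when it returns False.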

There is one question though (Sam might want to jump in here): this flag
probably breaks the concept of continuous integration and might be abused
by projects that, instead of fixing their docs to stop the nagging,
make the project 'unretrievable' for a long period of time, losing the
benefit of continuous integration.

I'm curious to see what others think about this.
 
> >
> >    * two build mechanisms
> >
> >      - Cocoon static file generation
> What if a build fails? What happens? Is an email sent to a list?

Yes, this is the behavior I'd like. Of course, it should also provide a
gump-like list of results on a web page.

> Is the
> build tried again later (after another fresh retrieval)?

No, it should not; it should simply wait for the next round. Of course,
if the build fails, the generated documents should not be uploaded to the
production site.
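
The behavior described above can be sketched as one round of the bot.
This is only an illustration under my own assumptions: build, deploy and
notify are hypothetical callables standing in for the real build, the
upload, and the mail to the list.

```python
from typing import Callable


def run_round(build: Callable[[], bool],
              deploy: Callable[[], None],
              notify: Callable[[str], None]) -> str:
    """Run one forrestbot round: build once, deploy only on success."""
    if build():
        deploy()
        return "deployed"
    # No retry here: a failed build waits for the next scheduled round,
    # and nothing is uploaded to the production site.
    notify("build failed; site left unchanged")
    return "failed"
```

The returned status is what a gump-like results page could list for
each project.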
 
> >      - webapp assembly
> >
> >    * (optional) deploy methods
> >
> >      - filecopy (also for the war?) - readying for rsync operations
> >      - cvs check-in (
> 
> Do we need live site cvs equivalent (as is the case today, where the web
> site is a locally checked out copy of the live site repository)?

We have been discussing with root@apache.org about this and it might
well be a possibility. If we add a tag for each update, we are able to
do direct revisioning of the web site.

> Do we
> need version history of what is shown on the web site? 

It could be desirable, but I wouldn't think so for our own documents
since:

 1) we already revision the original XML sources
 2) we include the built documents in every project release

Since the HTML files are simply the 'result' of the build process, it
would be like versioning both the java and the class files. It doesn't
really make sense, since both contain exactly the same information.

> What if a new web
> site version, even if built correctly, is problematic and needs to be
> changed immediately. It would be nice to be able to revert to a previous
> version, as is possible right now (through the live site repository).

Then you revert the documents in CVS and the website is rebuilt
correctly. This is how you do patches in java: you fix the source and
recompile the classes... you don't check out the previous classes.
 
> >      - scp (public/private key distribution required)
> >
> > 2) we use an XSLT transformation using the Ant style task transforming
> > the forrest.xconf to a set of Ant buildfile snippets (one per
> > project) - it is likely we'll be using the Xalan redirect function
> > here - I want the output to be granular for resilience reasons - a
> > snippet will basically consist of a parametrized call of the task
> > mentioned above
> >
> > 3) these generated targets are executed by Ant
> 
> How can Forrestbot be manually stopped, in case of a problem? 

I wouldn't implement such a feature. Forrestbot shouldn't be generally
stoppable, just like Gump.

> Will each
> individual project still have that control, or will it be only within
> control of Forrest (committers)?

In my vision, forrestbot is automatic, just like Gump: it runs, builds,
copies and nags. Every day. It helps you when you do the right thing and
nags you when you do the bad thing, just like a wise boss :)

It's simple, effective and keeps up the social experiment concept that
Gump started.

The only difference is that Gump doesn't affect anything directly (we
don't build distributions with Gump or things like that), while Forrest
will do both: perform the social test *and* do actual work on the
project (by publishing their docos)

Is this difference important enough to force us to change the Gump
model?

What do you think?

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<stefano@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------


