gump-general mailing list archives

From Sam Ruby <ru...@apache.org>
Subject [RT] gump.py
Date Sun, 13 Apr 2003 14:14:54 GMT
Preface
-------

[RT] stands for Random Thoughts.  This is a tradition in the Cocoon 
community.  RTs are basically long and thought-provoking mails with new project 
propositions, that are discussed and scrutinized at length.  One 
distinguishing characteristic about RTs is the complete and utter lack 
of consistency with respect to quality: some are pure crap, others are 
pure genius.  Even the original author of an RT is not sure which 
category any given posting falls into at the time it is issued.  This 
posting is no exception.

Motivation
----------

The core value I employed when I wrote the original Gump was that it 
should be minimalistic in two key ways: prereqs and runtime.  In terms of 
prereqs: while things Gump builds may have prereqs, all the current 
version of Gump itself requires to build and execute is an operating 
system and JDK 1.4.  In terms of runtime, there is no Gump code running 
during the actual builds - the only thing that remains is a script which 
sets various environment variables and invokes your existing and 
unmodified build.xml with various properties.  The only significant 
difference between the way Gump runs and the way developers do builds 
should be the value of the build.sysclasspath property, and this is 
something that can be set and experimented with outside of Gump.

Key issues with the current Gump implementation are its reliance on DOM, 
XSLT, and generated BAT/bash scripts.  These choices represent barriers 
to entry for potential contributors to the Gump codebase.  They also 
make it more difficult to implement appropriate error recovery and 
provide meaningful error messages when the inevitable configuration 
errors occur.

Approach
--------

This RT explores an alternate implementation in Python.  If you have a 
Unix or Linux machine, you probably already have Python installed.  For 
Windows users, this can be obtained for free from ActiveState [1].

The version used for development is Python 2.2.  On Red Hat machines 
this is named 'python2'.

Python is a dynamic, strongly typed, object-oriented scripting language 
and comes with XML support.  It is unlikely that a Python script would interact 
in any unintended way with Java based builds, so it meets the runtime 
requirement described in the motivation section above.

Design
------

The current implementation [2] has four basic parts.

SAXDispatcher maintains a stack of active elements during the parse of 
the workspace and dispatches incoming events to the appropriate element.
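
To make the stack idea concrete, here is a minimal sketch.  It is not 
the code from [2]: it is written for a current Python 3 rather than 
2.2, and the names Element, StackDispatcher and parse_workspace are 
made up for illustration.  It only shows one way a SAX handler can keep 
a stack of open elements and hand each event to the element on top.

```python
import xml.sax


class Element:
    """Hypothetical node type: records attributes, text and children."""

    def __init__(self, name, attrs):
        self.name = name
        self.attrs = dict(attrs.items())
        self.children = []
        self.text = ""

    def start_child(self, name, attrs):
        # The active element decides how to create its own children.
        child = Element(name, attrs)
        self.children.append(child)
        return child

    def characters(self, content):
        self.text += content


class StackDispatcher(xml.sax.ContentHandler):
    """Keeps a stack of open elements and forwards each SAX event
    to the element currently on top of the stack."""

    def __init__(self):
        super().__init__()
        self.root = None
        self.stack = []

    def startElement(self, name, attrs):
        if self.stack:
            element = self.stack[-1].start_child(name, attrs)
        else:
            element = self.root = Element(name, attrs)
        self.stack.append(element)

    def characters(self, content):
        if self.stack:
            self.stack[-1].characters(content)

    def endElement(self, name):
        # The element is complete; its parent becomes active again.
        self.stack.pop()


def parse_workspace(xml_text):
    """Parse an XML string and return the root Element."""
    handler = StackDispatcher()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.root
```

Calling parse_workspace on a small workspace string returns the root 
element with its nested children attached.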

The second part is a set of various base classes.  This RT won't go into 
detail on their implementation at this time.  I'll just mention that they 
implement Gump's current ability to have profiles and workspaces extend 
packages and modules, and that they make extensive use of Python's ability 
to dynamically define what properties an object has.
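
As a rough illustration of what dynamically defined properties buy you, 
consider the sketch below.  The class name GumpBase, the extend method 
and the rule that values already present win over inherited ones are 
all assumptions of mine for this example; the base classes in [2] may 
work quite differently.

```python
class GumpBase:
    """Hypothetical base class: properties are whatever the XML supplied."""

    def __init__(self, **attrs):
        # No fixed set of fields; each attribute becomes a property.
        for key, value in attrs.items():
            setattr(self, key, value)

    def extend(self, other):
        """Copy properties from `other` that this object lacks.

        Assumption: values already set on this object take precedence.
        """
        for key, value in vars(other).items():
            if key not in vars(self):
                setattr(self, key, value)
        return self


# A definition in a module file, and a workspace entry that augments it.
module_def = GumpBase(name="gump", srcdir="src")
workspace_def = GumpBase(name="gump", jvmargs="-Xmx256m")
workspace_def.extend(module_def)
```

After the call, workspace_def carries both its own jvmargs and the 
srcdir picked up from the module definition.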

Next is the actual Gump Object Model.  The way the base classes are 
defined, only the non-leaf parts of the gump object model which actually 
have logic associated with them require definition.  The definitions 
consist of simple declarations of expected elements, and whether one or 
multiple elements are expected of each.
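
A declaration style along these lines might look like the following 
sketch.  Single, Multiple, Node and the `expected` table are 
hypothetical names chosen for illustration, not the ones used in [2]. 
The point is only that non-leaf classes list their children, and 
everything else falls back to a generic node.

```python
class Single:
    """Marks a child element that is expected at most once."""

    def __init__(self, cls):
        self.cls = cls


class Multiple:
    """Marks a child element that may occur any number of times."""

    def __init__(self, cls):
        self.cls = cls


class Node:
    """Generic node; leaf elements need no class of their own."""

    expected = {}

    def __init__(self, attrs=None):
        self.attrs = dict(attrs or {})
        # Pre-populate declared children: a list for Multiple, None for Single.
        for tag, decl in self.expected.items():
            setattr(self, tag, [] if isinstance(decl, Multiple) else None)

    def add_child(self, tag, attrs=None):
        decl = self.expected.get(tag)
        cls = decl.cls if decl else Node  # undeclared tags get a plain Node
        child = cls(attrs)
        if isinstance(decl, Multiple):
            getattr(self, tag).append(child)
        else:
            setattr(self, tag, child)
        return child


class Project(Node):
    expected = {"ant": Single(Node), "depend": Multiple(Node)}


class Workspace(Node):
    expected = {"project": Multiple(Project)}


ws = Workspace({"basedir": "/tmp"})
junit = ws.add_child("project", {"name": "junit"})
junit.add_child("ant", {"target": "jar"})
junit.add_child("depend", {"project": "ant"})
junit.add_child("depend", {"project": "xerces"})
```

Here ws.project is a list of Project objects, junit.ant is a single 
node, and junit.depend is a list with one entry per depend element.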

The last part is some demonstration code.  Rubix.xml is parsed and then 
information is shown for the workspace, a simple project (junit), a 
project which has its definition partially in the workspace and partially 
in a separate module file, merged into one (gump), and a project which is 
augmented by information from the profile (jdbc).

Future
------

At the moment, this information is merely loaded from disk without 
interpretation.  The logic which is currently in Jenny (e.g., resolving 
properties and expanding dependencies) needs to be reproduced in Python.

Next, we need to figure out how to handle external project definitions. 
The speed of this script would degrade significantly if these were 
read every time.  Either some sort of caching which is only updated 
upon explicit request is needed, or a serialization of the full 
merged definition (as the current Jenny logic does) is required.
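
One way the caching option could look, as a sketch only: the function 
name load_definition, its parameters and the choice of pickle as the 
on-disk format are assumptions for illustration, and nothing like this 
exists in [2] yet.

```python
import os
import pickle


def load_definition(cache_path, build, refresh=False):
    """Return the merged definition, reusing a cached copy when possible.

    `build` is a caller-supplied function that does the expensive work
    of reading and merging all external definitions.  The cache is only
    rebuilt when it is missing or when `refresh` is explicitly requested.
    """
    if not refresh and os.path.exists(cache_path):
        with open(cache_path, "rb") as cache:
            return pickle.load(cache)

    merged = build()
    with open(cache_path, "wb") as cache:
        pickle.dump(merged, cache)
    return merged
```

A normal run would call load_definition(path, build) and pay the merge 
cost only once; passing refresh=True forces the definitions to be read 
again.  Note that pickle should only be used on cache files you wrote 
yourself.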

Finally, the logic currently placed in XSLT stylesheets would need to be 
accommodated.

And clearly, this script would need to be split into multiple files.  It 
currently is in one file for demonstration purposes.

Notables
--------

This implementation differs in the way that hrefs are handled: namely, 
this attribute is looked for and processed only for classes which 
inherit from the 'Named' base class.

Given the way that this implementation unifies elements and attributes, 
it is regrettable that projects have a package attribute (which indicates 
where the installed packages reside) and a package element (which 
identifies the Java package(s) a given project implements).

Epilogue
--------

I'm only interested in pursuing this if it increases the development and 
user community.

- Sam Ruby

[1] http://aspn.activestate.com/ASPN/Python
[2] http://intertwingly.net/stories/2003/04/13/gump.py


