forrest-dev mailing list archives

From Thorsten Scherler
Subject Re: A rethink of the Ivy migration
Date Wed, 28 Feb 2007 22:55:30 GMT
On Wed, 2007-02-28 at 12:30 +0000, Ross Gardler wrote:
> I've been having a rethink of our approach to the IVY migration.
> Pretty much all of our java dependencies are Cocoon dependencies. So why 
> should we bother to manage them ourselves?

Totally agree. All dependencies should come from cocoon but cocoon would
need to have an ivy.xml for it, right?

The route that we are taking ATM is very good for Ivy because we will
hand over a first-class repository, but as you point out the Java
dependencies are Cocoon dependencies, and we should manage only our own
specific ones.

> So why don't we use SVN head?
> Because SVN head has no crawler.
> So, we have two choices for progression.
> Continue with the path I have outlined for the IVY migration whereby we 
> use our own local repo to hold the specific version of Cocoon that we 
> know will work for us, or, we can use the move to ivy to force us to 
> upgrade to Cocoon head (something we should probably do for the 0.8 
> release anyway).

Of that I am actually not sure. Seeing that 0.8 is way overdue, updating
Cocoon will not help us release faster.

> The latter route will require us to build a crawler, but Thorsten has 
> already started on that with the Droids lab.

Yes, but the plugins that I wrote are not yet ready for prime time. It
should be easy to make them stable (more so if we are going to use them
here) but ATM they are not (they need more eyes and input). I just
changed the dependency management to Ivy today to get some practice for our

Droids has only a couple of dependencies, unlike Cocoon, so I just
needed to add one jar to an Ivy repository. Actually, as Ross points
out, Forrest will end up with nearly only one dependency: Cocoon. The
rest comes in through Cocoon.

> I've tried setting it up with just a cocoon-core 2.2-snapshot 
> dependency, there are a couple of unresolved dependencies, but I am sure 
> these can be easily addressed.
> The big advantage in going the latter route is that it will make future 
> upgrades to Cocoon a simple case of running an ant task. The downside is 
> that we will have to write a crawler (using the Droids lab code - which 
> is largely based on the crawler from Nutch if I understand correctly).

I took Nutch, ripped out the plugin/extension point framework and
wrote some PoC plugins. I changed many things to make the code simpler,
so not much of the original Nutch code is left. The first crawler is not
close to the one from Nutch, but via plugins one could implement the same
functionality (we do not need this for our use case).

The implemented crawler x-m02y07 is more wget style and we can use it as
a base for the Cocoon CLI. I actually wrote the crawler with our use
case in mind.
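
To illustrate what "wget style" means here, a minimal sketch of such a
crawl loop follows. It is not the Droids code: the class name
SimpleCrawler and the linksOf hook are hypothetical, and fetching and
parsing are assumed to happen behind that hook. It starts from one URI,
follows every link it has not seen before, and returns the pages in the
order they were discovered.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical illustration only; not part of Droids, Nutch or Cocoon.
final class SimpleCrawler {

    /**
     * Breadth-first crawl from a start URI.
     * linksOf is an assumed hook that returns the links found at a URI.
     */
    static List<String> crawl(String start,
                              Function<String, List<String>> linksOf) {
        // LinkedHashSet remembers discovery order and rejects duplicates.
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            String uri = queue.remove();
            for (String link : linksOf.apply(uri)) {
                // add() returns false for a URI that was already queued.
                if (seen.add(link)) {
                    queue.add(link);
                }
            }
        }
        return new ArrayList<>(seen);
    }
}
```

For a static export, each URI taken from the queue would also be written
to disk; that step is left out of the sketch.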

The critical issue is that we will need time to stabilize this new
crawler since forrest is heavily based on the static export. 

The core is the link recognition where we can go different routes. In
the short term we can enhance the parse-html droids plugin with neko
html (similar route as nutch is going) but in the long run we should try
to incorporate a virtual browser like Stefano pointed out on the labs
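
As a rough picture of what link recognition involves at its simplest,
here is a sketch that pulls href values out of anchor tags with a
regular expression. It is an assumption-laden illustration, not the
parse-html plugin: the class name LinkExtractor is made up, and a regex
will miss or misread real-world HTML, which is exactly why a tolerant
parser is the better short-term route.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the Droids parse-html plugin.
final class LinkExtractor {

    // Naive pattern: an <a ...> tag with a quoted href attribute.
    // Group 1 is the quote character, group 2 the link target.
    private static final Pattern HREF = Pattern.compile(
        "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the href values of all anchor tags, in document order. */
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2));
        }
        return links;
    }
}
```

Unquoted attribute values, links created by scripts, and links in
elements other than anchors are not found by this sketch.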

> So, should we continue with managing our own dependencies or should we 
> jump the short term hurdle and get the ivy branch working with Cocoon 
> 2.2 snapshots?

The last option is the one that we will need in the future in any case.
Still, the question is whether we want to put the release on hold for
a while longer.

I personally think that having only one dependency, on Cocoon, is the
only thing that makes sense for us. I agree with Ross's question: why
should we manage Cocoon's dependencies?

Thorsten Scherler                       
Open Source Java & XML                consulting, training and solutions
