cocoon-dev mailing list archives

From Thorsten Scherler <>
Subject Re: Reuse of pipelines in java
Date Tue, 16 Aug 2011 08:08:26 GMT
On Tue, 2011-08-16 at 09:25 +0200, Steven Dolg wrote:
> On 14.08.2011 14:18, Sylvain Wallez wrote:
> > On 12/08/11 21:08, Thorsten Scherler wrote:
> >> Hi all,
> >>
> >> I am migrating a StAX development from a customer to c3 StAX, since the
> >> resulting code will be much more generic and understandable.
> >>
> >> In my case I need to process all files from different folders, parse
> >> them and invoke a second pipeline from the main pipe.
> >>
> >> Meaning I have one principal pipeline which I need to repeat x times.
> >> I started to create the pipeline and it works very nicely; however, I
> >> encountered some downsides when reusing the pipe.
> >>
> >> I found that you can execute a java based pipe exactly one time. There
> >> is no method to reset the pipe. My plan was to inject the pipeline
> >> in my main code and then configure it on the fly (reusing the same pipe
> >> on different files).
> >>
> >> Furthermore, there is no way to dynamically change the different
> >> components once they are added to the pipe.
> >>
> >> I mean
> >>
> >> Pipeline<StAXPipelineComponent>  pipeStAX = new
> >> NonCachingPipeline<StAXPipelineComponent>();
> >> pipeStAX.addComponent(new XMLGenerator(input));
> >> ...
> >> pipeStAX.setup(System.out);
> >> pipeStAX.execute();
> >>
> >> Now my question is how people feel about:
> >> a) Making java based pipes resettable pipeStAX.reset()
> >> b) Adding a method like pipeStAX.getComponent(int i) to retrieve the
> >> component at position i.
> >
> a) What exactly should Pipeline.reset() do? (Besides calling reset on 
> each component)
> And what should a component do during a reset?
> I think components can be configured/set up as often as you like.
> b) If you construct the components directly, can't you keep a reference 
> to them and just call the setters/methods directly when needed?

Yes, but only if I can execute the pipe x times.

> I guess I don't understand why the pipeline is not reusable in your case 
> or what you need to reconfigure between the runs.
> Maybe you need x different pipelines for x different configurations?

If you look at the example above, you cannot do

pipeStAX.execute(); // first time works fine
pipeStAX.execute(); // the next time it does nothing

The second call of execute() will not do anything. A reset() or
redeploy() would make the pipeline usable again.

Then I can configure the different pipes again to use them x times.
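
To make the behaviour I am describing concrete, here is a minimal sketch. It uses stand-in types only: SketchPipeline, setInput(), getLog() and reset() are hypothetical names invented for this illustration and are not the Cocoon 3 Pipeline API. It only shows the idea of a pipe that runs once and becomes usable again after a reset.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in types for illustration only; these are NOT the Cocoon 3 classes.
class OneShotPipelineSketch {

    /** Hypothetical minimal pipeline that runs only once until it is reset. */
    static class SketchPipeline {
        private final List<String> log = new ArrayList<>();
        private String input;
        private boolean executed = false;

        void setInput(String input) {
            this.input = input;
        }

        /** Processes the current input once; later calls do nothing until reset(). */
        boolean execute() {
            if (executed) {
                return false; // second call is a no-op
            }
            log.add("processed " + input);
            executed = true;
            return true;
        }

        /** Hypothetical reset(): makes the pipeline executable again. */
        void reset() {
            executed = false;
        }

        List<String> getLog() {
            return log;
        }
    }

    public static void main(String[] args) {
        SketchPipeline pipe = new SketchPipeline();
        pipe.setInput("a.xml");
        System.out.println(pipe.execute()); // first run does the work
        System.out.println(pipe.execute()); // second run does nothing
        pipe.reset();
        pipe.setInput("b.xml");
        System.out.println(pipe.execute()); // works again after reset()
        System.out.println(pipe.getLog());
    }
}
```

What a real reset() would have to do on each component is exactly the open question from a) above; the sketch sidesteps it by keeping all state in one flag.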

> > Although reset() can allow pipeline reuse, it won't solve the problem 
> > when you have multiple concurrent threads that could benefit from 
> > reusing the pipeline.
> >
> > Cocoon 2.x had component pools to allow reuse in a multithreaded 
> > context while avoiding the big cost of reparsing the component's 
> > configuration, but this proved to have a significant overhead.
> >
> > A solution that wouldn't require many changes in the current API would 
> > be to require pipelines and pipeline components to be Cloneable, so 
> > that you could build a pipeline instance once at startup and then 
> > clone it each time you need to use it. That would require component 
> > writers to be careful about cloneability though.
> >
> > Sylvain
> >
> Pipelines are not thread-safe!
> I think the effort required to make them thread-safe is far too great 
> given the (IMO negligible) benefits.
> Since everyone can create their own pipeline components there is no way 
> to guarantee that it will work correctly all the time.
> (I don't think "should work in multi-threaded environments if the 
> component developer didn't make a mistake" should appear in any 
> documentation)
> In the case mentioned above (direct Pipeline API calls) component 
> instances are created by the user's code, so the responsibility of doing 
> that efficiently and correctly is the user's and not ours, IMO.
> Something like a component factory / provider is currently well outside 
> the Pipeline API's responsibilities - actually it's part of the sitemap 
> - and I think it should stay that way.
> I see the Pipeline API as a small library that provides some helpful 
> classes, which you use in a very controlled and precise manner (like 
> commons-lang, commons-io, etc.)
> Not like a full execution environment with its own flow of control (you 
> get that when you use cocoon-servlet with sitemaps).
> If you really need/want more efficient construction of components, give 
> this task to someone who specializes in that.
> Make a Spring context and use prototype beans or even create an object 
> pool, or use some other dependency injection container you like.
> I don't think we should try to compete with those frameworks on their 
> home field.

I agree that we should not reinvent the wheel in those areas; maybe we can
learn from the thread "Springification" to implement this.
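
As a rough illustration of the "build a fresh instance per run" idea (what a prototype bean would give you), here is a plain Java sketch with no container involved. OneShot, runAll() and the lambda factory are hypothetical names made up for this example; they stand in for a real pipeline and for whatever factory or DI container would create it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Stand-in types for illustration only; these are NOT the Cocoon 3 classes.
class FreshPipelinePerRunSketch {

    /** Hypothetical one-shot pipeline: meant to be executed a single time. */
    interface OneShot {
        String execute();
    }

    /** Builds and runs a new pipeline for every input instead of resetting a shared one. */
    static List<String> runAll(List<String> inputs, Function<String, OneShot> factory) {
        List<String> results = new ArrayList<>();
        for (String input : inputs) {
            OneShot pipeline = factory.apply(input); // new instance per run
            results.add(pipeline.execute());
        }
        return results;
    }

    public static void main(String[] args) {
        // The factory plays the role of the prototype bean definition or provider.
        Function<String, OneShot> factory = input -> () -> "parsed:" + input;
        System.out.println(runAll(Arrays.asList("a.xml", "b.xml"), factory));
    }
}
```

Since every run gets its own instance, no reset() is needed and nothing is shared between runs; the cost is constructing the components each time.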

Thorsten Scherler <>
codeBusters S.L. - web based systems
<consulting, training and solutions>
