commons-dev mailing list archives

From Alex Karasulu <>
Subject Re: [chain] Pipeline implementation
Date Mon, 20 Sep 2004 16:32:31 GMT
Hi Kris,

On Fri, 2004-09-17 at 19:29, Kris Nuttycombe wrote:


> >>The model that we've been using is that stages process data instead of 
> >>events, although one could certainly consider an event as a type of 
> >Ahh that's interesting.  We do the events and carry a payload.  One of
> >the benefits we get with using an EventObject derived event is a nice
> >event type hierarchy that can be used to filter when routing events. 
> >Also we can associate other pieces of information with the data to
> >control how it is processed.  One of the things we're working on in
> >particular where this is coming in handy is for synchronization within
> >the staged pipeline.  There's much we have to do here though.  I'm
> >toying with implementing rendezvous points for events and using other
> >constructs for better processing control of entire pipelines.
> How is routing handled if you have multiple subscribers registered that 
> can handle the same type of event? 

In that case the event is delivered to all matching subscribers.
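
To make that concrete, here's a rough sketch of fan-out delivery with
type-hierarchy filtering (the class and method names are illustrative,
not our actual EventRouter API):

```java
import java.util.ArrayList;
import java.util.EventObject;
import java.util.List;

// Illustrative sketch only. Every subscription whose event type matches
// the published event (including supertypes, which gives the
// type-hierarchy filtering mentioned above) receives the event.
class RouterSketch {
    interface Subscriber {
        void inform(EventObject event);
    }

    static class Subscription {
        final Class<?> eventType;     // deliver events assignable to this type
        final Subscriber subscriber;
        Subscription(Class<?> eventType, Subscriber subscriber) {
            this.eventType = eventType;
            this.subscriber = subscriber;
        }
    }

    static class EventRouter {
        private final List<Subscription> subscriptions =
            new ArrayList<Subscription>();

        void subscribe(Class<?> eventType, Subscriber s) {
            subscriptions.add(new Subscription(eventType, s));
        }

        // Fan-out: the event goes to ALL subscribers whose type matches.
        void publish(EventObject event) {
            for (Subscription sub : subscriptions) {
                if (sub.eventType.isInstance(event)) {
                    sub.subscriber.inform(event);
                }
            }
        }
    }
}
```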

> Is it possible under your framework 
> to explicitly specify the notification sequence such that an event is 
> handled by one subscriber which raises the same variety of event to be 
> handled by the next subscriber?

We're currently exploring ideas along those lines.  Actually this is a
pretty good way, in some respects, to achieve something we have been
searching for.  Thanks for the idea; I need to mull it over a bit.

>  I ask because one of the main things we 
> do with our implementation is sequential processing where the input data 
> may be valid for a number of stages, but the order of the stages is 
> critical for correct results to be produced. For example, we do a lot of 
> processing of spatial data. A reader stage may generate a sequence of 
> geometry objects from a datafile which are enqueued on a subsequent 
> stage. We may apply line generalization and polygon splitting algorithms 
> to these data sequentially, and the resulting values are quite different 
> depending upon which order these filters are specified in. The stages 
> that wrap these algorithms are generic; the only information that they 
> have is that they should expect the input queue to contain geometries, 
> or perhaps beans that have one or more properties that are geometries. 
> The execution order is defined externally in a configuration file. We 
> don't suffer from strong coupling between stages because we give up on 
> compile-time type safety and leave it up to the stage to determine what 
> to do if it's fed an object of an unexpected type.

You still don't need to give up on type safety with this pub/sub model,
because type safety is honored in the event types and the subscribers,
yet you keep the dynamism.
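
For example, the event classes themselves can carry the type guarantee
(a sketch; GeometryEvent and the payload accessor are invented names,
not anything in our code):

```java
import java.util.EventObject;

// Sketch of an event-type hierarchy carrying a payload. A stage that
// subscribes only to GeometryEvent can downcast safely, because the
// router never delivers anything but instances of the subscribed type.
class EventHierarchySketch {
    static class DataEvent extends EventObject {
        private final Object payload;
        DataEvent(Object source, Object payload) {
            super(source);
            this.payload = payload;
        }
        Object getPayload() { return payload; }
    }

    static class GeometryEvent extends DataEvent {
        GeometryEvent(Object source, Object geometry) {
            super(source, geometry);
        }
    }

    // A stage subscribed only to GeometryEvent: the cast is safe by
    // construction, so the stage never sees an unexpected object type.
    static Object handleGeometry(DataEvent event) {
        GeometryEvent ge = (GeometryEvent) event; // guaranteed by subscription
        return ge.getPayload();
    }
}
```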

Regarding the synchronization constructs needed for sequential
processing through portions of the pipeline: we are still thinking about
how to do this without a lot of synchronization overhead.  Perhaps we
can think about this one together.
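
One construct I've been toying with, purely as a sketch and nothing in
the codebase yet, is a rendezvous point built on a countdown latch (for
example java.util.concurrent's CountDownLatch, or Doug Lea's earlier
util.concurrent equivalent): downstream processing blocks until a known
number of upstream events have arrived.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical rendezvous point for a pipeline: downstream work waits
// until a fixed number of upstream events have reached this point.
class RendezvousSketch {
    private final CountDownLatch latch;

    RendezvousSketch(int expectedEvents) {
        this.latch = new CountDownLatch(expectedEvents);
    }

    void eventArrived() {
        latch.countDown();   // one upstream event has reached the rendezvous
    }

    void awaitAll() throws InterruptedException {
        latch.await();       // block until all expected events arrive
    }

    boolean done() {
        return latch.getCount() == 0;
    }
}
```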

> >>data.  Our stages have to be aware of the  pipelines in which they 
> >>reside because our Stage interface defines additional "exqueue(Object 
> >>obj)" and "exqueue(String key, Object obj)" methods which are used to 
> >>enqueue an object on either a subsequent stage or a keyed pipeline 
> >>branch, respectively.
> >I had the same problem which created a high degree of coupling between
> >stages.  Since stages were implemented in IoC frameworks sometimes there
> >were complaints in complex systems where cycles were introduced.  I
> >started using a simple pub/sub event router/hub to decouple these
> >stages.  Ohhh, looks like you asked about that below...
> The way that we avoided problems with cycles was pretty simple; we don't 
> allow a stage to exist in more than a single pipeline simultaneously. 
> Stages are pretty lightweight so it's not a hassle to create multiple 
> stages of a single type if that's what's needed. If there's an explicit 
> need to create a cycle, it's trivial to write a stage that injects data 
> it receives elsewhere in the pipeline and you treat it like recursion, 
> making sure that the exit condition is well specified somewhere within 
> the cycle.
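
For what it's worth, that cycle-as-recursion idea could be sketched like
this (the halving step and the threshold are invented purely for
illustration):

```java
import java.util.LinkedList;
import java.util.Queue;

// Sketch of "cycle as recursion": a feedback stage re-injects its output
// at the head of the pipeline until an exit condition holds, so the
// cycle always terminates, just like well-founded recursion.
class FeedbackSketch {
    static int runPipeline(int input, int threshold) {
        Queue<Integer> queue = new LinkedList<Integer>();
        queue.add(input);
        int result = input;
        while (!queue.isEmpty()) {
            int value = queue.remove();
            result = value / 2;          // the "processing" stage
            if (result > threshold) {
                queue.add(result);       // feedback: re-inject the result
            }                            // else: exit condition met, cycle ends
        }
        return result;
    }
}
```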
> >We have a service we've defined called the EventRouter along with
> >Subscribers and Subscriptions.  It's the core dependency; instead
> >of having every stage depend on others downstream we make each stage
> >dependent on the EventRouter.  Basically this forms a hub and spoke like
> >dependency relationship between the stages and the event
> >broker/router/hub whatever you like to call it.  Now we can dynamically
> >register new Subscriptions with it to route events to different stages. 
> >We use the event router to handle configuration events while tying
> >together the pipeline, as well as for the in-band processing of data
> >flowing through the system.

> So is the usual use case that a StageHandler will handle one event and 
> generate another that is referred back to the router for handling? 

Yes, that's a fair way to put it.

> Doesn't this cause problems if your StageHandler raises the same variety 
> of event that caused it to be invoked in the first place?

If there is no terminating condition then yes, I would suppose so;
however, none of our stages generate the same event type they process.
If they did, it could produce an infinite event-processing loop.
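
A tiny sketch of why consuming one type and raising a different one
keeps the hub-and-spoke loop safe (all names here are illustrative, not
our actual API):

```java
import java.util.ArrayList;
import java.util.EventObject;
import java.util.List;

// Sketch: the stage consumes a RawEvent and refers a *different* event
// type (ProcessedEvent) back to the router. If it republished another
// RawEvent instead, the router would re-invoke it forever.
class HubSpokeSketch {
    static class RawEvent extends EventObject {
        final String text;
        RawEvent(Object src, String text) { super(src); this.text = text; }
    }
    static class ProcessedEvent extends EventObject {
        final String text;
        ProcessedEvent(Object src, String text) { super(src); this.text = text; }
    }

    static final List<EventObject> routed = new ArrayList<EventObject>();

    // The "router": records every event; redelivers RawEvents to the stage.
    static void publish(EventObject e) {
        routed.add(e);
        if (e instanceof RawEvent) {
            stage((RawEvent) e);   // the stage is subscribed to RawEvent only
        }
    }

    // The stage: handles a RawEvent, raises a ProcessedEvent to the router.
    static void stage(RawEvent e) {
        publish(new ProcessedEvent(e.getSource(), e.text.toUpperCase()));
        // publishing another RawEvent here would loop without terminating
    }
}
```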

