commons-dev mailing list archives

From Craig McClanahan <>
Subject Re: [chain] Pipeline implementation
Date Fri, 17 Sep 2004 18:56:34 GMT
On Fri, 17 Sep 2004 12:10:36 -0600, Kris Nuttycombe
<> wrote:
> Hi, Craig,
> I agree that the second approach seems to be the most appropriate and
> flexible. From our current architecture it should be trivial to
> implement an adapter from a pipeline stage that uses a chain to
> determine the processing logic for a thread.
> How should I proceed as far as setting up the sandbox project goes? For
> starters I can send you the current codebase and maven project files.

That would be helpful -- you can send them to me offline if you want.

One other bureaucratic detail will become important before I can
actually commit the code, even to the sandbox -- Apache requests a
Contributor License Agreement for contributions of significant code
bodies (as this would be) from all the original authors.  For more
info (and a link to the relevant documents), please see:

In addition, the proposed code will need to have the Apache License
(version 2) on each source file, as described on the same page.

One other process note ... I'm halfway through an extended trip out of
the country, and will be travelling over the weekend, so it'll likely
be at least Monday before I can do anything concrete.
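
To make the second approach concrete, here is a minimal Java sketch of a
stage with its own input queue and process(Object) hook, plus an adapter
that delegates the per-item logic to a chain-style command.  Everything
in it is an assumption made for illustration: Stage, ChainStage,
ChainCommand, the STOP sentinel and the "input"/"output" context keys are
hypothetical names, and ChainCommand is a local stand-in rather than the
actual [chain] interface.  It is not the code discussed in this thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for a chain-style command (not the real [chain] API).
interface ChainCommand {
    // Returns true when processing of this context is considered complete.
    boolean execute(Map<String, Object> context) throws Exception;
}

// Hypothetical stage: an input queue plus a process(Object) hook.
abstract class Stage implements Runnable {
    static final Object STOP = new Object();   // assumed end-of-input sentinel
    private final BlockingQueue<Object> input = new LinkedBlockingQueue<>();
    private Stage next;

    void setNext(Stage next) { this.next = next; }

    // The queue is unbounded, so adding never blocks; only take() blocks.
    void enqueue(Object o) { input.add(o); }

    // Pass a processed object on to the following stage, if there is one.
    protected void emit(Object o) { if (next != null) next.enqueue(o); }

    protected abstract void process(Object o) throws Exception;

    @Override
    public void run() {
        try {
            // Apply process() to each queued element, in arrival order.
            for (Object o = input.take(); o != STOP; o = input.take()) {
                process(o);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (Exception e) {
            throw new IllegalStateException("stage failed", e);
        } finally {
            if (next != null) next.enqueue(STOP);  // let downstream shut down
        }
    }
}

// Adapter: a stage whose processing logic is supplied by a chain-style command.
class ChainStage extends Stage {
    private final ChainCommand command;

    ChainStage(ChainCommand command) { this.command = command; }

    @Override
    protected void process(Object o) throws Exception {
        Map<String, Object> context = new HashMap<>();
        context.put("input", o);                   // illustrative key name
        command.execute(context);
        emit(context.getOrDefault("output", o));   // fall back to the input
    }
}

class PipelineSketch {
    // Runs two stages on separate threads and returns what the last one saw.
    static List<Object> runDemo() throws InterruptedException {
        List<Object> results = new ArrayList<>();
        Stage upper = new ChainStage(ctx -> {
            ctx.put("output", ctx.get("input").toString().toUpperCase());
            return true;
        });
        Stage collect = new Stage() {
            @Override protected void process(Object o) { results.add(o); }
        };
        upper.setNext(collect);
        Thread t1 = new Thread(upper);
        Thread t2 = new Thread(collect);
        t1.start();
        t2.start();
        upper.enqueue("a");
        upper.enqueue("b");
        upper.enqueue(Stage.STOP);
        t1.join();
        t2.join();
        return results;
    }
}
```

Calling PipelineSketch.runDemo() returns the list [A, B].  The point of
the arrangement is that the queue and thread handling live entirely in
Stage, so the same stage could be driven by a chain or by any other
processing logic.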


> Thanks for your interest!
> Kris
> Craig McClanahan wrote:
> >Hello Kris,
> >
> >The pipeline support you describe does indeed sound interesting, and
> >especially effective in application environments where multithreaded
> >processing support is appropriate.  In general, that has tended not to
> >be the case in the area that motivated creation of [chain] -- web
> >applications (for example, you should not be accessing a single
> >HttpServletRequest instance from more than one thread), so I didn't
> >think about multiple thread support when creating the original [chain]
> >architecture.
> >
> >Assuming I understand what you describe well enough, I also agree that
> >there could be substantial overlap between what you describe and
> >[chain] as it currently exists -- to the point that sharing common
> >infrastructure would seem to make a lot of sense.  Indeed, it seems
> >like the common concept would be how to specify what happens "in
> >between" the blocking queues, and it's the queue and thread management
> >that is unique.  Does that sound right?
> >
> >If so, at least three ways we could proceed:
> >
> >* Add a separate Commons package ([pipeline]?) for the multithread
> >  queue management stuff, which depends on [chain] for the
> >  implementation of what happens on a thread
> >
> >* Create [pipeline] as above, but have it define its own interface for
> >  how you plug in the processing logic for a thread, and then have
> >  [chain] or [pipeline] provide an adapter so that you can use a chain
> >  to specify the processing logic.
> >
> >* Add a layer in [chain] to provide the multithread queue management
> >  stuff as part of the same package, but not required to use a chain
> >  in a single thread.
> >
> >All three approaches seem viable, but based on my current
> >understanding of what you are describing it seems like the second one
> >might be the best.  The queue management sounds like something that is
> >generally useful in its own right, no matter how you choose to
> >implement the actual processing logic.  I'd be happy to further
> >discuss the overall approach too -- I'm not married to any particular
> >approach.
> >
> >If that sounds like a good idea to you, I'd be happy to work with you
> >to create a Commons Sandbox package to let us share and experiment
> >with the code.  I can do the commits until you or others on your team
> >get voted to be committers on Jakarta Commons.
> >
> >Would this sort of approach be of interest?
> >
> >Craig McClanahan
> >
> >
> >On Fri, 17 Sep 2004 11:28:07 -0600, Kris Nuttycombe
> ><> wrote:
> >
> >
> >>Hi, all,
> >>
> >>I'm writing to get some advice and perhaps offer some code that may be
> >>useful to the commons-chain project or elsewhere.
> >>
> >>The group I work for does a large amount of data processing and we are
> >>working on solutions for pipelined data processing. Our current
> >>implementation uses a pipeline model where each stage in the pipeline
> >>has an integrated blocking queue and an abstract process(Object o)
> >>method that is sequentially applied to each element in the queue. When a
> >>stage is finished processing, it may pass the processed object (or
> >>products derived from it) onto the input queue of one or more subsequent
> >>stages in the pipeline. Branching pipelines are supported, and the whole
> >>mess is configured using Digester.
> >>
> >>There's a lot of similarity here with the chain of responsibility
> >>pattern that commons-chain implements, but subtle differences as well.
> >>Each stage runs in one or more separate threads and we are working to
> >>allow the processing to be distributed across the network. The pipeline
> >>model assumes that each object placed in the pipe is going to be
> >>processed by every stage, whereas to my understanding the chain of
> >>responsibility is more designed for finding an appropriate command to
> >>use to process a given context. Also, the pipeline is designed to run as
> >>a service where data can be provided for processing by automated
> >>systems. For example, data being beamed down from a satellite can be
> >>aggregated into orbits that are then passed into the pipeline for
> >>generation of geolocated gridded products, statistical analysis, etc.
> >>
> >>Our group would really like to be able to contribute some of this code
> >>back to the commons effort, since we use a ton of commons components.
> >>The amount of overlap with commons-chain is significant, but I'm not
> >>sure it's a perfect match because of the differing goals. Does anyone
> >>out there know of other similar efforts? Is there a place for this sort
> >>of code in commons? Are we just missing something fundamental about
> >>commons-chain where we should simply be using that instead?
> >>
> >>Suggestions would be much appreciated. I'm happy to send code, examples,
> >>and documentation to anyone who's interested.
> >>
> >>Thanks,
> >>Kris
> >>
> >>--
> >>=====================================================
> >>Kris Nuttycombe
> >>Associate Scientist
> >>Enterprise Data Systems Group
> >>CIRES, National Geophysical Data Center/NOAA
> >>(303) 497-6337
> >>
> >>=====================================================
> >>
> >>---------------------------------------------------------------------
> >>To unsubscribe, e-mail:
> >>For additional commands, e-mail:
> >>
> >>
> >>
> >>
> >
> --
> =====================================================
> Kris Nuttycombe
> Associate Scientist
> Geospatial Data Services Group
> CIRES, National Geophysical Data Center/NOAA
> (303) 497-6337
> =====================================================