cocoon-dev mailing list archives

From Stefano Mazzocchi <>
Subject Re: Flow wishlist :)
Date Mon, 02 Dec 2002 23:57:05 GMT
Hunsberger, Peter wrote:

> Let me be clear that I'm not looking for dynamic pipeline generation. The
> mapping of URI to generator is well defined for everything that we want
> to do.  The selection of transformer is a little less so; for example, the 1
> result vs. multiple results example we talked about earlier. That's still a
> static sitemap, but the transformer is chosen at run time; so the
> understanding of dynamic vs. static sitemap should be clear: dynamic sitemap
> would mean building the sitemap every time it's run.  I can't imagine anyone
> that would want that? 

The ability to attach new components to pipelines at runtime has been 
asked for in the past and I've always been against this (and I still am).

>>>Basically, we've got a requirement for a rules based evaluation of 
>>>context data.  I don't want to code this in the sitemap language and I 
>>>don't want to hard code it in Java, I really want dynamic rules 
>>Look: I really don't get what you mean by this. Sorry, I'm slow 
>>sometimes: can you show me an *explicit* example of your functional 
>>needs? otherwise I don't feel I can be much helpful if we keep this 
>>level of abstraction without me understanding where you want to go.
> I'm not sure I can explain this via e-mail much more than I have. Maybe this
> needs some background in rule based systems or expert systems design; I
> don't know how much you may have encountered such things?

I don't have practical experience with rule-based systems, no.

> Let me use an example I've given previously on the list:

Sorry, I must have missed that. Thanks for taking the time to write it.

> Patient privacy
> rules are such that it's possible for a researcher to be doing research
> using patient data and not be allowed to know the identity of the patient.
> It's also possible that there is such a small patient population for a given
> treatment protocol that a combination of very few searches would be needed
> to uniquely identify a patient.  For example, for a given protocol there
> may be only a single patient born between the years 1980 and 1985 living in
> Tennessee. Thus, the rules might be that we allow a search by birth date if
> the user hasn't previously done a search by geography (or vice-versa).  We
> have to evaluate each action in the context of previous actions on the same
> data.  So as a researcher uses the system he builds up this trail of history
> data that starts to follow him around and accumulate; he has done action
> X in the distant past, action Y more recently, then action Z just now:
> therefore, in the current context (of having just done Z) action Z is or
> is not allowed (because actions X and Y are still considered relevant).
> This history data doesn't just come from a single source; it can come from
> external systems so we really want to use some generic format to process it,
> as such XML is well suited. 


> In-other-words, for us, the decision on what action to take at any given
> point is dynamically evaluated (like most action handlers), but the
> decisions are just based on one heck of a lot of complex processing (and not
> just simple form field evaluation); this processing falls into the general
> CS pattern of expert systems processing, more specifically rule-based
> expert systems (among other things). As a result, in our particular case,
> functional programming meets our needs better than most other solutions.  

Cool, I trust your reasoning on this and I'm starting to understand why 
you want special XSLT templating where rules are stored in databases.

But I think that what you are looking for is a special transformer. It 
doesn't require changes to what cocoon pipelines are. At least, I don't 
see why it should.
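Just to check my understanding of the example above, here is a minimal sketch of that kind of history-based check in plain JavaScript. All the names (the `conflicting` table, `isAllowed`, the action fields) are invented for illustration and have nothing to do with any real Cocoon component:

```javascript
// Pairs of search dimensions that must not be combined by one user
// on one protocol. In a real system these selector pairs would be
// loaded from the database, not hard-coded.
const conflicting = {
  birthDate: ["geography"],
  geography: ["birthDate"]
};

// history: the accumulated trail of past actions,
// each one shaped like { user, protocol, dimension }.
function isAllowed(history, action) {
  const banned = conflicting[action.dimension] || [];
  // deny the action if any past action by the same user, on the
  // same protocol, used a conflicting search dimension
  return !history.some(past =>
    past.user === action.user &&
    past.protocol === action.protocol &&
    banned.includes(past.dimension)
  );
}
```

Each incoming action is evaluated against the trail, which is exactly the kind of work a dedicated transformer (or stylesheet) could encapsulate.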

>>>We know a
>>>functional programming model can provide this.  Just so happens we 
>>>have such capabilities at hand via XSLT if we can feed it rules 
>>>selectors described as XML.
>>Sounds like the good old 'golden-hammer' antipattern to me. But I won't 
>>comment further until I understand your requirements.
> Perhaps so, if it wasn't so easy to get the data into XML format we might be
> looking at wiring in LISP or Haskell (or whatever) processing on the data,
> but as it sits XSLT is an obvious way to go to meet this need. I will
> observe that it's currently even harder for us to find LISP or Haskell
> developers than XSLT developers (we keep looking)...

Again, true.

>>>In-other-words: currently, sitemap has access to context via URI, 
>>>parameters, generators, etc.  Based on this, sitemap spits out a 
>>>decision on what transform to use.
>>That's one way of looking at it. Another is that your functional logic 
>>could be directly included in the generator.
> Well yes, and that's sort of what I am proposing.  However, let me note
> that, taken to the extreme, your statement is equivalent to saying that
> everything can be done with a single generator.  

True. One could say that the hard thing about designing a pipeline is 
knowing where to stop separating the components (this is true for any 
cluster-oriented paradigm like OOP, COP, AOP).

> Even I'm not proposing
> that, though for us I'll end up reducing the number of required generators.
>>>What I want instead is to feed an XSLT this same set
>>>of context as XML and have the XSLT pick the subsequent transform to 
>>>use. The advantages to me are: 1) I can code in XSLT instead of 
>>>sitemap language;
>>>2) I can optimize the entire chain of events since the transform picking
>>>XSLT can pass on the context to the next transform (standard transform
>>>chaining); 3) I get a functional programming model (not an advantage to
>>>some, I know).
>>I don't get it: you say that your requirements are so horrible that you 
>>need to keep all your rules in a database (which is a questionable 
>>statement right there, but I don't have details to judge it). Then you 
>>say that a sitemap becomes a mess. Result: you want to write a XSLT 
>>stylesheet that uses extensions to connect to a database to obtain 
>>dynamically generated pattern-matching rules to transform an XML 
>>representation of your request context into a directly-digestible output?
> The rules don't go into the database, the rule selectors go into the
> database. I don't think any extensions should be needed; the context data
> will be created using standard generators and possibly aggregation, though
> as we proceed we're finding that our generators inherit from each other and
> aggregation isn't needed; each generator picks up what is needed
> automatically. Likely, for the other cases we'll end up using composition in
> the generators and eliminate aggregation in the sitemap.

So, why don't you do something like

  <map:transform type="xslt" src="cocoon://..."/>

where the src is a cocoon pipeline that generates the stylesheet you need?
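For what it's worth, a sketch of what I mean. All the patterns and file names here are invented, and the SQLTransformer setup is simplified:

```xml
<!-- Hypothetical sitemap fragment: the transformer's stylesheet is
     itself produced by another pipeline, so the rule selectors can
     come from the database at request time. -->
<map:match pattern="report/*">
  <map:generate src="context/{1}.xml"/>
  <!-- 'cocoon://rules2xsl' issues an internal request to the
       pipeline matched below -->
  <map:transform type="xslt" src="cocoon://rules2xsl"/>
  <map:serialize type="xml"/>
</map:match>

<map:match pattern="rules2xsl">
  <!-- query document describing the rule selectors -->
  <map:generate src="queries/rule-selectors.xml"/>
  <!-- execute the embedded queries against the database -->
  <map:transform type="sql"/>
  <!-- turn the result rows into an XSLT stylesheet -->
  <map:transform src="stylesheets/selectors2xsl.xsl"/>
  <map:serialize type="xml"/>
</map:match>
```

The outer pipeline stays a perfectly ordinary generate/transform/serialize chain; only the stylesheet it loads is dynamic.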

>>how that is going to be any better than a sitemap+flowmap is *very* hard 
>>to see from where I stand.
> Better is a relative thing.

Oh, totally.

> As we've sort of concluded, in some cases your
> development requirements get messy no matter what way you go.  When you're
> building systems for research it's often a case of picking the lesser of
> two evils...

Wise words. Still, here you are pointing out that cocoon might require 
changes to its internal model, while I really don't see this: it seems 
that your concerns reside at a higher application level than 
sitemap+flowscript (if I understood correctly, of course).

>>moreover, it sounds like an optimal solution to kill your webapp 
>>performance: that stylesheet becomes your bottleneck. So either you 
>>write your own XSLT engine (or extend an existing one) to be able to 
>>optimize those database-extracted rules, or you're going to have 
>>*serious* scalability issues right there.
> Evaluating all these rules is going to be a performance issue no matter how
> we go.  Unfortunately, the new government privacy requirements force some of
> this on us (we've still got a couple of years before they all go into
> effect).  Similarly, the complexity of the research environment forces some
> of this on us.  That's part of the "challenge" of working in research...
>>In both cases, big implementation PITA.
>>End result? your people might know the XSLT syntax, but one thing is to 
>>know the syntax of a language, another is to be able to read a 
>>stylesheet that includes hard-core functional programming connected to 
>>external datasource via extensions.
>>If you think that is going to be easier for you to find people able to 
>>read/write/maintain those hard-core stylesheets than it will be to find 
>>people that can learn the sitemap syntax and read a few lines of 
>>javascript, I think you have some thick walls to crash into in your 
>>future :)
> As we've discussed this is going to be messy either way.  If it was just a
> few lines of JavaScript needed to do all the complex evaluation of the
> current context data then it wouldn't be an issue.  However, to code the
> processing we need would mean many 1000's of lines of JavaScript.
> Nowadays we've got good XSLT editors, schema validators etc. so the job
> of creating the XSLT isn't as hard as it used to be.  Functional programming
> is always an issue (sigh)...
> So far this is all working; the XSLTs keep getting smaller as we generalize
> things out and discover new generic processing patterns. To me that's a key
> sign that we are on the right track (in the past I've also seen C++
> code get smaller as it gets generalized while gaining more function).
> <small snip/>
>>>However, I could also see
>>>how there might be situations where the serialization decision might 
>>>be part of the new thingy, and thus the blocks discussion and how to 
>>>hand off service calls becomes relevant.
>>It is *NOT* a transformer decision to drive the serialization process. 
>>It's against both SoC and IoC! There is nothing planned for Cocoon 
>>Blocks that will allow this to happen and as soon as I have to vote 
>>around here, you'll get my -1 on anything that makes it possible for one 
>>pipeline component to dynamically modify the pipeline execution, 
>>including choosing a serializer.
> I'm not asking for the transformer to drive the serialization decision; we
> definitely want to separate those decisions!  What I was saying is that if
> you have a generalized way of extending the sitemap then the decision on
> where to plug in the serialization becomes an issue.

I would be against a generalized way of extending the sitemap. I want 
people to build consensus on this list not route around it with 
pluggable extensions.

This is more of a community thing than a technical issue, but look at the 
mess that Avalon became after allowing people to diverge without 
creating consensus :/

>  It's the resources
> discussion: a resource might do generation and transformation, or it might
> do transformation and serialization, or whatever. In my case, the question
> of whether blocks will allow this doesn't matter a whole lot, since for the
> most part I think we can behave mostly as a pure transformer.  However, I
> could possibly see a case where I want our new thingy to behave more like a
> generic resource and take over more of the otherwise standard sitemap
> processing.  

> SoC shouldn't mean that the only place you can separate the transformation
> and serialization decisions is in the master sitemap, some other
> component/block might also have a good way of separating these decisions and
> handling them...

Great, but let's try not to mix concerns: you started saying that you 
have a 'different approach' to resource production than 
sitemap+flowscript using XSLT.

I still fail to see how.

What you are presenting above is a very complex way to transform your 
data. I don't see how you can manage flow with XSLT.

Don't get me wrong, I'm not criticizing, I'm trying to understand if 
your functional requirements are something that the S+F 
(sitemap+flowscript) cannot cope with.

And even the most complex XSLT-based transformation stage is something 
that S+F is perfectly capable of handling (or, at least, I fail to see a 
reason why not).
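To make the contrast concrete, here is roughly what the flow side of S+F looks like. `cocoon.sendPageAndWait()` and `cocoon.request` are real flowscript, but the function, pipeline and parameter names are invented, and this only runs inside the Cocoon runtime, so take it as a sketch:

```javascript
// Hypothetical flowscript: the flow (which page comes next, what
// state accumulates across requests) lives here, while all the
// XSLT work stays in ordinary sitemap pipelines.
function doSearch() {
  var history = [];          // the researcher's accumulated trail
  while (true) {
    // suspend the continuation until the user submits the form;
    // 'search-form' is an ordinary sitemap pipeline
    cocoon.sendPageAndWait("search-form", { history: history });
    var dimension = cocoon.request.getParameter("dimension");
    history.push(dimension);
    // the rule evaluation itself can still be a transformer or a
    // database-driven stylesheet inside this pipeline
    cocoon.sendPageAndWait("search-results/" + dimension,
                           { history: history });
  }
}
```

The point is that state and sequencing live in the script, not in the stylesheets.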

Anyway, I really don't see how you are going to do flow description with 
XSLT. Do you have a code snippet to show your point? That would be helpful.

Stefano Mazzocchi                               <>

