cocoon-dev mailing list archives

From "Hunsberger, Peter" <Peter.Hunsber...@stjude.org>
Subject RE: Flow wishlist :)
Date Mon, 02 Dec 2002 16:05:21 GMT
>> Hmm, I guess I've been exploring this long enough that I thought it 
>> was somewhat intuitively obvious, sorry...  (Someone has also remarked 
>> that this sounded somewhat like a capability that was in C1.)
> 
> yeah, you could implement what you want using dynamically added PI 
> (processing instructions) for the C1 reactor, but it would end up being 
> very messy anyway.

When it was described to me it didn't really sound like what I wanted...

>> I'll skip the background for all this, but basically our requirements 
>> are a bit extreme; 1000's of variations on 100's of UI "screens"; each 
>> UI variation can change over time, and it must be possible to audit 
>> the history of changes and see data in the form that the UI was at the 
>> time the data was originally created.  As such, we need to be able to 
>> drive ALL UI construction, validation and workflow dynamically out of 
>> a database.
> 
> Ok, let's start from these requirements.
>
>> Cocoon is a good starting point for this, but as we work with Cocoon 
>> we end up coding presentation rules and presentation flow rules in the 
>> sitemap.

> Well, that's what it was designed for. The design of the sitemap was 
> *explicitly* meant to be static and remains so. So far, nobody was able 
> to come up with an example where dynamically generated pipelines were 
> the *only* way to solve a functional requirement.
> 
> Anyway, I'll be strongly against adding dynamic pipeline generation 
> capabilities, either in the sitemap or in the flowmap.

Let me be clear that I'm not looking for dynamic pipeline generation. The
mapping of URI to generator is well defined for everything that we want to
do.  The selection of transformer is a little less so; for example, the 1
result vs. multiple results example we talked about earlier. That's still a
static sitemap, but the transformer is chosen at run time.  So the
distinction between dynamic and static sitemap should be clear: a dynamic
sitemap would mean building the sitemap every time it's run.  I can't imagine
anyone wanting that.
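
To make the distinction concrete, here is a minimal sketch in Python rather
than sitemap syntax. The stage names, the stylesheet names and the
pick_transformer helper are all made up for illustration; none of this is
Cocoon API. The pipeline shape is fixed up front and only the stylesheet is
picked per request:

```python
# Hypothetical sketch: a pipeline whose shape never changes, with one
# stage (the transformer) resolved at run time from the request context.

def pick_transformer(result_count: int) -> str:
    """Return the stylesheet for a result set (names are assumptions)."""
    if result_count == 1:
        return "detail.xsl"   # a single hit goes straight to the detail view
    return "list.xsl"         # zero or many hits go to the list view


# The "static sitemap": generator, transformer chooser, serializer.
STATIC_PIPELINE = ("search-generator", pick_transformer, "html-serializer")


def run(result_count: int) -> list:
    """Resolve the fixed pipeline for one request."""
    generator, chooser, serializer = STATIC_PIPELINE
    return [generator, chooser(result_count), serializer]
```

A dynamic sitemap, by contrast, would rebuild STATIC_PIPELINE itself on
every request, which is not what I'm after.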

>> Basically, we've got a requirement for a rules based evaluation of 
>> context data.  I don't want to code this in the sitemap language and I 
>> don't want to hard code it in Java, I really want dynamic rules 
>> evaluation.
>
> Look: I really don't get what you mean by this. Sorry, I'm slow 
> sometimes: can you show me an *explicit* example of your functional 
> needs? otherwise I don't feel I can be much helpful if we keep this 
> level of abstraction without me understanding where you want to go.

I'm not sure I can explain this via e-mail much more than I have. Maybe this
needs some background in rule based systems or expert systems design; I
don't know how much you may have encountered such things?
  
Let me use an example I've given previously on the list:  Patient privacy
rules are such that it's possible for a researcher to be doing research
using patient data and not be allowed to know the identity of the patient.
It's also possible that there is such a small patient population for a given
treatment protocol that a combination of very few searches would be enough
to uniquely identify a patient.  For example, for a given protocol there
may be only a single patient born between the years 1980 and 1985 living in
Tennessee. Thus, the rules might be that we allow a search by birth date if
the user hasn't previously done a search by geography (or vice-versa).  We
have to evaluate each action in the context of previous actions on the same
data.  So as a researcher uses the system they build up a trail of history
data that follows them around and accumulates: they have done action X in
the distant past, action Y more recently, then action Z just now.  In the
current context (of having just asked for Z), action Z is either not allowed
or allowed, depending on whether actions X and Y are still considered
relevant.  This history data doesn't just come from a single source; it can
come from external systems, so we really want to use some generic format to
process it, and XML is well suited to that.
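
A minimal sketch of that kind of history-dependent check, in Python purely
for illustration. The Action type, the CONFLICTS table and is_allowed are
hypothetical names, and the real rules are far more involved than one
lookup table; this only shows the shape of "evaluate each action in the
context of previous actions on the same data":

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Action:
    kind: str      # e.g. "search-by-birth-date" (made-up label)
    protocol: str  # treatment protocol the action ran against


# Assumed rule table: a requested kind is denied when any of the listed
# kinds already appears in the user's history for the same protocol.
CONFLICTS = {
    "search-by-birth-date": {"search-by-geography"},
    "search-by-geography": {"search-by-birth-date"},
}


def is_allowed(requested: Action, history: Iterable[Action]) -> bool:
    """Check the requested action against the accumulated history."""
    blocked_by = CONFLICTS.get(requested.kind, set())
    return not any(
        past.protocol == requested.protocol and past.kind in blocked_by
        for past in history
    )
```

Here a birth date search is refused once a geography search exists in the
history for the same protocol, and the reverse; searches against a
different protocol are unaffected.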

In other words, for us, the decision on what action to take at any given
point is dynamically evaluated (like most action handlers), but the
decisions are based on one heck of a lot of complex processing (and not
just simple form field evaluation).  This processing falls into the general
CS pattern of expert systems processing, and more specifically rules based
expert systems (among other things). As a result, in our particular case,
functional programming meets our needs better than most other solutions.

>> We know a
>> functional programming model can provide this.  Just so happens we 
>> have such capabilities at hand via XSLT if we can feed it rules 
>> selectors described as XML.
>
> Sounds like the good old 'golden-hammer' antipattern to me. But I won't 
> comment further until I understand your requirements.

Perhaps so.  If it weren't so easy to get the data into XML format we might
be looking at wiring in LISP or Haskell (or whatever) processing on the
data, but as it sits XSLT is an obvious way to go to meet this need. I will
observe that it's currently even harder for us to find LISP or Haskell
developers than XSLT developers (we keep looking)...

>> In-other-words: currently, sitemap has access to context via URI, 
>> parameters, generators, etc.  Based on this, sitemap spits out a 
>> decision on what transform to use.
> 
> That's one way of looking at it. Another is that your functional logic 
> could be directly included in the generator.

Well yes, and that's sort of what I am proposing.  However, let me note
that, taken to the extreme, your statement is equivalent to saying that
everything can be done with a single generator.  Even I'm not proposing
that, though for us I'll end up reducing the number of required generators.

>> What I want instead is to feed an XSLT this same set
>> of context as XML and have the XSLT pick the subsequent transform to 
>> use. The advantages to me are; 1) I can code in XSLT instead of 
>> sitemap language;
>> 2) I can optimize the entire chain of events since the transform picking
>> XSLT can pass on the context to the next transform (standard transform
>> chaining); 3) I get a functional programming model (not an advantage to
>> some, I know).
>
> I don't get it: you say that your requirements are so horrible that you 
> need to keep all your rules into a database (which is a questionable 
> sentence right there, but I don't have details to judge it). Then you 
> say that a sitemap becomes a mess. Result: you want to write a XSLT 
> stylesheet that uses extensions to connect to a database to obtain 
> dynamically generated pattern-matching rules to transform an XML 
> representation of your request context into a directly-digestible output?

The rules don't go into the database; the rule selectors go into the
database. I don't think any extensions should be needed.  The context data
will be created using standard generators and possibly aggregation, though
as we proceed we're finding that our generators inherit from each other and
aggregation isn't needed; each generator picks up what is needed
automatically. Likely, for the other cases we'll end up using composition in
the generators and eliminate aggregation in the sitemap.
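
Roughly, and only as a sketch: the table layout, column names and
stylesheet names below are assumptions for illustration, not our actual
schema. The database holds selectors that say which rule set applies to a
screen as of a given date, while the rules themselves stay in the
stylesheets. Python with an in-memory SQLite database stands in for the
real store:

```python
import sqlite3


def open_selector_store() -> sqlite3.Connection:
    """Build a throwaway selector table (hypothetical schema and rows)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE rule_selector "
        "(screen TEXT, valid_from TEXT, stylesheet TEXT)"
    )
    db.executemany(
        "INSERT INTO rule_selector VALUES (?, ?, ?)",
        [
            ("patient-search", "2002-01-01", "search-rules-v1.xsl"),
            ("patient-search", "2002-10-01", "search-rules-v2.xsl"),
        ],
    )
    return db


def select_rules(db: sqlite3.Connection, screen: str, as_of: str):
    """Return the stylesheet in force for a screen on an ISO date, or None."""
    row = db.execute(
        "SELECT stylesheet FROM rule_selector "
        "WHERE screen = ? AND valid_from <= ? "
        "ORDER BY valid_from DESC LIMIT 1",
        (screen, as_of),  # ISO dates compare correctly as plain strings
    ).fetchone()
    return row[0] if row else None
```

Keeping a valid_from date on each selector is one way to cover the audit
requirement: data created in June 2002 can be shown with the rule set that
was in force in June 2002.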

> how that is going to be any better than a sitemap+flowmap is *very* hard 
> to see from where I stand.

Better is a relative thing.  As we've sort of concluded, in some cases your
development requirements get messy no matter which way you go.  When you're
building systems for research it's often a case of picking the lesser of two
evils...

> moreover, it sounds like an optimal solution to kill your webapp 
> performance: that stylesheet becomes your bottleneck. So either you 
> write your own xslt engine (or extend an existing one) to be able to 
> optimize those database-extracted rules, or you're going to have 
> *serious* scalability issues right there.

Evaluating all these rules is going to be a performance issue no matter how
we go.  Unfortunately, the new government privacy requirements force some of
this on us (we've still got a couple of years before they all go into
effect).  Similarly, the complexity of the research environment forces some
of this on us.  That's part of the "challenge" of working in research...

> In both cases, big implementation PITA.
> 
> End result? your people might know the XSLT syntax, but one thing is to 
> know the syntax of a language, another is to be able to read a 
> stylesheet that includes hard-core functional programming connected to 
> external datasource via extensions.
> 
> If you think that is going to be easier for you to find people able to 
> read/write/maintain those hard-core stylesheets than it will be to find 
> people that can learn the sitemap syntax and read a few lines of 
> javascript, I think you have some thick walls to crash into in your 
> future :)

As we've discussed, this is going to be messy either way.  If it was just a
few lines of JavaScript needed to do all the complex evaluation of the
current context data then it wouldn't be an issue.  However, to code the
processing we need would mean many thousands of lines of JavaScript.
Nowadays we've got good XSLT editors, schema validators etc., so the job
of creating the XSLT isn't as hard as it used to be.  Functional programming
is always an issue (sigh)...

So far this is all working; the XSLTs keep getting smaller as we generalize
things out and discover new generic processing patterns.  To me that's a key
sign that we are on the right track (in the past I've also seen C++ code
get smaller as it gets generalized but gains more function.)

<small snip/>

>> However, I could also see
>> how there might be situations where the serialization decision might 
>> be part of the new thingy, and thus the blocks discussion and how to 
>> hand off service calls becomes relevant.
> 
> It is *NOT* a transformer decision to drive the serialization process. 
> It's against both SoC and IoC! There is nothing planned for Cocoon 
> Blocks that will allow this to happen and as soon as I have to vote 
> around here, you'll get my -1 on anything that makes possible for one 
> pipeline component to modify dynamically the pipeline execution, 
> including choosing a serializer.

I'm not asking for the transformer to drive the serialization decision; we
definitely want to separate those decisions!  What I was saying is that if
you have a generalized way of extending the sitemap then the decision on
where to plug in the serialization becomes an issue.  It's the resources
discussion: a resource might do generation and transformation, or it might
do transformation and serialization, or whatever. In my case, the question
of whether blocks will allow this doesn't matter a whole lot, since I think
we can behave mostly as a pure transformer.  However, I could possibly see a
case where I want our new thingy to behave more like a generic resource and
take over more of the otherwise standard sitemap processing.

SoC shouldn't mean that the only place you can separate the transformation
and serialization decisions is in the master sitemap; some other
component/block might also have a good way of separating these decisions and
handling them...



---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

