cocoon-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: Flowmaps: the wrong approach
Date Tue, 04 Dec 2001 14:47:40 GMT
Daniel Fagerstrom wrote:

> > On the other hand, I disagree that FSM equals goto-programming (in fact,
> > you are describing FSM down below, using the XML syntax :)
>
> Do you think so? In an FSM I can go from any state to any other state, but in
> the examples below I cannot go, e.g., from a stage inside a loop to a stage
> before the loop. Furthermore, if we use a stack (which is only needed if we
> allow for general recursion or flowmaps that are separately compiled), that
> gives more "power" than an FSM has. Or am I missing something?

Well, we probably have a miscontextualization problem, so allow me to
take a step back:

Ovidiu stated that programming a FSM is a pain. You agreed on that and
used the goto-is-bad antipattern as an example of this.

But a FSM is nothing but another way to express a Turing machine.

So, by transitive logical rules, you guys are saying that programming a
Turing machine is a pain. Which clearly can't be used as an argument,
given the million different ways that exist to program a Turing
machine.

Back to cocoon: Ovidiu said that we should follow those patterns on
continuations and avoid page-centric programming.

So, if in your context FSM programming means "page-centric programming",
I totally agree with both of you that we should "get control back" when
building webapps.

This goes along with anti-SoC: if a web application's logic is scattered
around many files, the concern is separated where it should *not* be!

I fully support this notion of recompacting the webapp logic concern.

At the same time, the papers that Ovidiu showed us aggregate the web-app
logic concern back together but forget to keep the other concerns separate.
So, we are back to the old problem of web-app frameworks that happen to
be so programmer-oriented that only programmers can work on them.

And this is not what we want.

We want a balanced system where concerns are well separated but a single
concern is not scattered around tens of files.

> > We *must* take into consideration try/fail by providing the ability to
> > update the form page if some data inserting error is made. This is vital
> > for webapp usability.
>
> Yes, I agree completely, error handling is _very_ important. I have been
> thinking a lot about handling of data insertion errors, and was about to
> throw an RT on it. I didn't, however, think about it from a "flowmap
> perspective" before, so I am still not clear about how I would like to
> integrate it.

Neither do I, but let's keep talking about it and maybe we'll get our
collaborative mind-waves to collapse into some solid quantum state. :)
 
> Anyway, here are the main lines:
> 
> * The webapp gets input from form fields, an upload or a SOAP-like call
> (are there more possibilities?). The input data consists of strings in
> parameters, a document containing some (hopefully) structured data (e.g.
> tab-separated numbers) that we have decided to accept as input, or, if we
> are really lucky, XML data.

It doesn't really matter: we can provide generators that adapt to the
request payload and format the result into a standardized markup that we
define. Along the lines of what you were proposing before.
 
> * This data should, sooner rather than later, be transformed to XML. After
> all we are in the _XML_-webapp business :) This seems to be an obvious task
> for a generator. (In this context I actually prefer the term deserializer,
> or something similar, as you mentioned, although did not advocate, in your
> original post in the "Data goes in data goes out" thread).

Yes, it's a 'deserializer' but it's not any different (at least
behaviorally) from a generator (which is a slightly more general term
anyway).
 
> * Now that the input data is in XML form we could, or IMHO should, have an
> XML Schema (or your favorite schema language) to validate the input data
> against. We don't want to put data in the wrong format in our database or
> in our Java programs, do we?

Is this calling for a ValidatingTransformer, or do we need some specific
sitemap semantics for validation of pipeline stages?

I ask because I can't picture what happens if we have a transformer and
the input is invalid. Sounds like a transformer is not enough as a
behavioral interface for this task, but maybe I'm wrong.
 
> * So what is the result of the validation? There seem to be three main
> cases:
> + The input is valid - let it flow to the next step in the pipe.
> + The input has invalid structure - this means that we have a fatal input
> error, an error that is hard to recover from or to report anything
> sensible about. If we have designed the client side, a structural error
> means that we have a bug in our system or that someone is trying to post
> data without using our client. In both cases we cannot do much more than
> log what is relevant and report a fatal error to the user. If we offer a
> webservice or allow for uploads we probably have to work a little bit
> harder on our feedback to the user.

Exactly. It would be dead simple to have the transformer generate some
markup explaining the errors, but this is *far* from being ideal from a
user-friendliness point of view.

> + The input has valid structure but invalid data types in the text fields or
> in the attributes. This is the case you asked for above. This case is more
> complicated: we have to give the user detailed feedback on what's wrong and a
> possibility to update the faulty data fields. There are two possible ways for
> the validator to inform the rest of the system about what went wrong. One is
> a list of (location path, error message) pairs. This can describe all kinds
> of field errors, but it is not obvious to me how to make use of the
> information. Another possibility is to only allow user input within elements
> and not in attributes; in this case the input XML can be annotated with
> error attributes in the faulty elements.

Annotating is not a problem even for attributes since we can use
namespaced attributes for that.
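
A minimal sketch of the namespaced-annotation idea, written in Python rather than as a Cocoon component. The namespace URI, the `err` prefix, the attribute names and the sample form are all assumptions made up for illustration; nothing here is an existing Cocoon API.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for error annotations (an assumption, not a Cocoon URI).
ERR_NS = "http://example.org/form-errors"
ET.register_namespace("err", ERR_NS)


def annotate_element(element, message):
    """Flag a faulty element with a namespaced 'message' attribute."""
    element.set(f"{{{ERR_NS}}}message", message)


def annotate_attribute(element, attr_name, message):
    """Flag a faulty attribute with a namespaced attribute of the same local name."""
    element.set(f"{{{ERR_NS}}}{attr_name}", message)


# Sample input with one bad element value and one bad attribute value.
form = ET.fromstring('<order currency="XX"><quantity>ten</quantity></order>')
annotate_element(form.find("quantity"), "not a number")
annotate_attribute(form, "currency", "unknown currency code")

annotated = ET.tostring(form, encoding="unicode")
print(annotated)
```

Because the annotations live in their own namespace, the original `currency` attribute and the element content are left untouched, so a later stylesheet could render the form again and show the messages next to the faulty fields.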

The problem is controlling the sequence of components in the pipeline
when a mistake is found. This calls for a "conditional" component.

Hmmm, thinking out loud: can't we throw "selectors" into this?

hmmm...
 
> I think that the validator should be a transformer: it takes XML as input
> and, except for fatal errors, emits XML. It could be a part of the
> generators, but that probably leads to an overly complex design of the
> generators. Xerces2 is actually built as a pipeline with pluggable components
> (not Avalon components, however), where the pipeline can consist of
> components like a scanner, a DTD validator, an XML-Schema validator etc., and
> where the pipeline components communicate with XNI events, which are like
> SAX events but somewhat more low-level. After having browsed the relevant
> parts of the Xerces2 source code I believe that it should be possible to
> reuse some of the components to build an "error annotating" validator, but I
> am not completely certain yet.

I agree, the design of Xerces2 is very elegant and very useful, but I
would point to the Relax NG implementation, which is designed exactly for
that: validating an XML infoset at any processing time.
 
> * More complicated validation that checks e.g. dependencies between fields
> could be based on the "bind" elements from XForms and be put in another
> transformer.

Ah, double validation, sounds like a cool idea:

 - a generator produces the SAX events from request parameters using
some markup we define (hey, could also be useful for multi-part payloads
and for file uploads!) in its own namespace.

 - a ??? validates the markup structure and datatypes
 
 - another ??? validates the content and augments the infoset with
any error information

Now let's come up with an idea for ???:

 - it can't be a generator because it has SAX input
 - it can't be a transformer because it must "route" the output
 - it can't be a selector because it doesn't work on the stream

Gosh, this seems to imply we need another component or we have to extend
some component's behavior.

The new component should be a hybrid between a selector and a
transformer: a selector that is also capable of transforming what passes
through.

We could extend selector functionality in that direction, let's make an
example:

 <map:match pattern="form">
  <map:generate type="form"/>
  <map:select type="relax" src="form.relax">
   <map:when test="invalid">
    <map:transform src="structure-errors.xsl"/>
    <map:serialize/>
   </map:when>
   <map:otherwise>
    <map:select type="xform-validator" src="form.xml">
     <map:when test="invalid">
      <map:transform src="form-errors.xsl"/>
      <map:serialize/>
     </map:when>
     <map:otherwise>
      ...
     </map:otherwise>
    </map:select>
   </map:otherwise>
  </map:select>
 </map:match>

where '...' further processes the input (that part is still foggy for me,
unfortunately).
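
To make the routing in the sitemap example concrete, here is a rough sketch in Python (not Cocoon code). The two check functions are hypothetical stand-ins for the "relax" and "xform-validator" selectors, the field names and rules are invented, and the returned labels merely name the branch that would be taken.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    """What a validating selector hands on: a verdict plus the document."""
    valid: bool
    document: dict
    errors: dict = field(default_factory=dict)


def structure_check(doc):
    """Stand-in for the structural stage: required fields must be present."""
    missing = [name for name in ("name", "age") if name not in doc]
    return Result(not missing, doc, {name: "missing" for name in missing})


def content_check(doc):
    """Stand-in for the content stage: values must have the right type."""
    errors = {}
    if not doc["age"].isdigit():
        errors["age"] = "must be a whole number"
    return Result(not errors, doc, errors)


def handle(doc):
    """Route the document the way the nested when/otherwise branches would."""
    first = structure_check(doc)
    if not first.valid:
        return ("structure-errors", first.errors)
    second = content_check(first.document)
    if not second.valid:
        return ("form-errors", second.errors)
    return ("continue", second.document)


print(handle({"name": "Ada"}))
print(handle({"name": "Ada", "age": "x"}))
print(handle({"name": "Ada", "age": "36"}))
```

Each stage returns both a verdict and the (possibly annotated) document, which is the "selector that can also transform" behavior described above.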

What do you think?

> So, now comes the crux: the steps this far seem to be quite naturally
> described in terms of a pipeline. But now we have to make a choice on where
> to pipe the results: if the validation succeeded the results should be piped
> to the "DoIt"-transformer and if we got field errors the results should be
> piped to a "partly filled in form with error messages"-transformer. This can
> definitely not be done in the same pipeline.
> 
> AFAIK, but I may have missed something important, if you want to build
> something like what I outlined above in Cocoon today you have to use a
> number of actions instead of generators and transformers. That obscures the
> pipeline aspect of serialization and validation. As an example, the
> "StreamGenerator" would be useful as a part of an XForms handling pipe,
> but if I decide to validate, I would have to build a "StreamAction" that
> places its result in the model or in the session in some place that the
> "ValidatorAction" has to know about.
> 
> So what I would like to have instead (and I apologize if I have missed
> important aspects of what you can do with what's currently available in
> Cocoon) is something like the following:
> 
> * An "input pipeline", like the one I described above, that is required to be
> side-effect free and only dependent on input. The output of the
> input pipeline is stored in a data structure (a DOM tree, I guess). OK, that
> hurts, but I cannot see any choice: we can only know the result of a
> validation after having seen all the data, and until then we have to store it
> somewhere.
> 
> * Now we can have flow control (a selector maybe?) based on the result of
> the validation, and also on things like XPath expressions applied to the
> input, and on the global state of the system.

Hey, we're having the same ideas here (sorry, but I normally don't read
the entire message before replying, to avoid being influenced by others)
very cool :)
 
> * Based on the selection in the flow control, the input structure is
> unpacked to SAX events again in the chosen pipeline. And for this step we
> have a "real" pipeline again that is free to perform any side effects it
> wants to.
> 
> I think that the outlined concepts should integrate well with a continuation
> based flowmap engine, but I am starting to get too tired to explain how.
> 
> OK, I seem to have written my RT anyway; it was not my intention :)

Oh, well, I think it didn't hurt :) Quite the opposite, I would say.
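
The buffer-then-route scheme in Daniel's outline can be sketched with Python's standard SAX module, used here only as a stand-in for Cocoon's Java SAX pipelines. A list of events takes the place of the DOM tree he mentions, and the `Recorder`, `Collector` and `looks_valid` names, the sample input and the "digits only" rule are all assumptions for illustration.

```python
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


class Recorder(ContentHandler):
    """Side-effect-free input stage: buffers SAX events until all are seen."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        self.events.append(("chars", content))

    def endElement(self, name):
        self.events.append(("end", name))

    def replay(self, handler):
        """Unpack the buffered events into the chosen downstream handler."""
        for kind, *args in self.events:
            if kind == "start":
                handler.startElement(args[0], AttributesImpl(args[1]))
            elif kind == "chars":
                handler.characters(args[0])
            else:
                handler.endElement(args[0])


class Collector(ContentHandler):
    """Hypothetical downstream pipeline; it only records element names."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def startElement(self, name, attrs):
        self.seen.append(name)


def looks_valid(events):
    """Toy validation rule (an assumption): all character data must be digits."""
    text = "".join(e[1] for e in events if e[0] == "chars").strip()
    return text.isdigit()


recorder = Recorder()
xml.sax.parseString(b"<form><age>ten</age></form>", recorder)

do_it = Collector()        # would perform the real side effects
error_form = Collector()   # would re-render the form with error messages
target = do_it if looks_valid(recorder.events) else error_form
recorder.replay(target)
```

The decision is taken only after the whole input has been buffered, and only the chosen handler ever receives events, so the side-effecting pipeline never sees invalid data.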
 
> > Hmmm, as a personal taste, I'd rather pass the continuation hashcode as
> > a hidden parameter of the form, so that it doesn't "pollute" the URI. Of
> > course, we can't let the user take care of this so we must come up with
> > something for this.
> 
> Agreed, I thought that there was a need for one hashcode for each
> continuation on a page with multiple links, but they are anyhow
> distinguished by their URIs, and they are even connected to the same state.
> 
> To clarify:
> * For stateless pages, we don't need any hashcode.
> * For non copyable continuations, the hashcode will, AFAIK, correspond to
> the session id, so we can let the session handling system take care of it
> instead.
> * For copyable continuations we need a new hashcode for each page.
> 
> > What about using XForms directly and provide our own transformations to
> > HTML forms that take care of everything? (they could even add
> > client-side javascript validation code)
>
> Yes, absolutely, I have written such a system together with a colleague. It
> is based on a small subset of an early draft of XForms. From the
> XForms document we create an XSLT document (actually by using another
> XSLT document :) ) that generates a populated HTML form when it gets
> XML instance data as input. It also uses the kind of error
> attributes that I mentioned in my "embedded RT" above.
> 
> I hope that I will find time to refine and update our things so that I can
> submit them.

Uh, that'd be great.

> > I see value in what you explain, but the use of XSLT as a variable
> > expansion language is, IMO, a little bit overkill since no
> > transformation is taking place.
> >
> > What do you think about Velocity instead?
>
> Yes, Velocity is probably better for this kind of thing, at least if it can
> use XPaths (I have just browsed the documentation so I don't know much
> about it). For my own part I happen to be an XSLT fanatic and use it for
> all kinds of stuff where other languages might be a better choice ;)

Well, choice is the key: if you like XSLT, great, but let's keep the
mechanism that performs variable substitution pluggable so that
everybody chooses what they like the best and nobody complains :)

> > > This situation can be handled by restricting the creation of new
> > > continuations, so that one copy of a continuation is allowed for each
> > > stage in the flow.
> >
> > Yes. Even if it is cool to use continuations to avoid the need to
> > check for back and cloning, I see very little value in letting the user
> > clone the window without having finished the previous flow.
> >
> > I see no problem in forbidding this by restricting the creation to a
> > single continuation.
>
> I got carried away by the article; cloned pages are definitely not the main
> use case for flowmaps, although we actually use them (but based on other
> methods) in a data warehouse application at the company that I work for.

Uh, really? What for? I couldn't really figure out a useful use of these
(other than performing multi-threaded internet browsing, which is what
I normally do), but I avoid having two browser windows on the same webapp
like the plague (knowing the complexity in handling such a thing,
especially for cookie-based applications).

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<stefano@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------



---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

