Mailing-List: contact cocoon-dev-help@xml.apache.org; run by ezmlm
Precedence: bulk
Reply-To: cocoon-dev@xml.apache.org
Message-ID: <3C06AD65.70604@apache.org>
Date: Thu, 29 Nov 2001 16:49:25 -0500
From: Berin Loritsch <bloritsch@apache.org>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US;
 rv:0.9.6) Gecko/20011120
MIME-Version: 1.0
To: cocoon-dev@xml.apache.org
Subject: Re: data goes in, data goes out (RT in disguise)
References: <3C056815.F67BB0B9@apache.org> <3C067E84.C75FF9C1@apache.org>
 <3C068A0D.40103@apache.org> <E169Y1i-00020m-00@wahoo>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Peter Royal wrote:

> On Thursday 29 November 2001 02:18 pm, you wrote:
> 
>>In essence, we are merging concepts of Cocoon with the Handler concept of
>>Axis.  There is now no distinction between an Action and a Transformer that
>>acts on the request markup.  By arranging the order of these
>>Transformer/Handlers we can elegantly create powerful web applications. 
>>This might be more in line with the reactor pattern--but nothing like Cocoon
>>1.
>>
> 
> Let's take the classic posting data to a form example, how would this work in 
> a bidirectional XML pipeline? User posts data to the webserver, this is then 
> transformed into an XML request and an incoming pipeline to the 
> service(generator) is constructed. Would the Transformers on that pipeline be 
> destructive? That is removing data from the XML request and applying it to 
> the external store? (The opposite of a transformer that watches for tags and 
> reads from the external store).


It depends on the transformer/handler.  It can simply react to the markup coming
in and place it in the persistent store and pass the results along, or it can
act as a regular transformer and manipulate the XML passed through it.  The choice
(as dangerous as it sounds) is up to you.
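
A minimal sketch of the first, non-destructive kind of handler, assuming plain
SAX from the JDK rather than any actual Cocoon interface.  StoringFilter,
process() and the in-memory map standing in for the persistent store are all
hypothetical names used only for illustration:

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

public class StoringHandlerSketch {

    /** Records the text of each leaf element, then forwards every event unchanged. */
    static final class StoringFilter extends XMLFilterImpl {
        // Assumption: a map stands in for the real persistent store.
        final Map<String, String> store = new LinkedHashMap<>();
        private final StringBuilder text = new StringBuilder();

        StoringFilter(XMLReader parent) {
            super(parent);
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            text.setLength(0);                            // start collecting text afresh
            super.startElement(uri, local, qName, atts);  // pass the event along
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            text.append(ch, start, length);
            super.characters(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            String value = text.toString().trim();
            if (!value.isEmpty()) {
                store.put(qName, value);                  // react to the markup
            }
            text.setLength(0);
            super.endElement(uri, local, qName);
        }
    }

    /** Parses the XML through the filter and returns what the filter stored. */
    static Map<String, String> process(String xml, DefaultHandler downstream) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        StoringFilter filter = new StoringFilter(reader);
        filter.setContentHandler(downstream);
        filter.parse(new InputSource(new StringReader(xml)));
        return filter.store;
    }

    public static void main(String[] args) throws Exception {
        final int[] elementsSeen = {0};
        DefaultHandler downstream = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                elementsSeen[0]++;                        // the next stage still sees everything
            }
        };
        Map<String, String> store = process("<myform><foo>bar</foo></myform>", downstream);
        System.out.println(store + ", downstream elements: " + elementsSeen[0]);
    }
}
```

A destructive variant would simply not call super for the events it consumes,
which is where the "dangerous" part comes in.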


> Should the incoming pipeline whittle the XML request down to nothing by the 
> time it reaches the service, or does the XML go through the service, with the 
> service acting as the U-turn point?


There are specific pieces of information the sitemap requires in order to
successfully process the information.  In essence, the best practice would be
to have all Action style handlers on the input pipeline and all Transformer
style handlers on the output pipeline.

By implementing the Action like the Servlet 2.3 Filter specification, we now
have an incredibly powerful and easy to use paradigm.  In other words, the
Actions are able to react to the incoming request based on the sitemap's instructions,
or, better, based on filters.  The filters are more generic than matchers and
selectors, in that they merely verify whether the associated handler needs to
react to this request.

For example, assume we have the following HTTP request:

POST /myform/results.html HTTP/1.1
Content-Type: text/xml

<myform>
   <foo>bar</foo>
</myform>

A filter can react to all "POST" requests to "/myform/results.html".  Another
filter could react to all "text/xml" content types.  The filter declarations
would either be in a sitemap, or mounted by a sitemap.  That way, the filters
that react to URIs only have to worry about relative uris so the same modules
can be used in different URI spaces without rewriting absolute URIs.  It almost
brings tears to your eyes.
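
A rough sketch of how such filters might look, assuming a made-up
RequestFilter interface and IncomingRequest type (neither is an existing
Cocoon or Servlet class).  The sitemap is assumed to be mounted at /myform/,
so the URI filter only sees the relative "results.html":

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FilterSketch {

    /** Hypothetical, minimal view of an incoming request. */
    static final class IncomingRequest {
        final String method;
        final String relativeUri;   // relative to the sitemap mount point
        final String contentType;

        IncomingRequest(String method, String relativeUri, String contentType) {
            this.method = method;
            this.relativeUri = relativeUri;
            this.contentType = contentType;
        }
    }

    /** Hypothetical filter: it only decides whether its handler needs to react. */
    interface RequestFilter {
        boolean accepts(IncomingRequest request);
    }

    /** Returns the names of the handlers whose filters accept the request. */
    static List<String> matchingHandlers(IncomingRequest request,
                                         Map<String, RequestFilter> filters) {
        List<String> matched = new ArrayList<>();
        for (Map.Entry<String, RequestFilter> entry : filters.entrySet()) {
            if (entry.getValue().accepts(request)) {
                matched.add(entry.getKey());
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Handler names are illustrative only.
        Map<String, RequestFilter> filters = new LinkedHashMap<>();
        filters.put("store-form",
            r -> "POST".equals(r.method) && "results.html".equals(r.relativeUri));
        filters.put("parse-xml", r -> "text/xml".equals(r.contentType));
        filters.put("render-page", r -> "GET".equals(r.method));

        IncomingRequest request = new IncomingRequest("POST", "results.html", "text/xml");
        System.out.println(matchingHandlers(request, filters));
    }
}
```

Because the filter never sees the mount prefix, the same declarations could be
reused under a different URI space without being touched.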

Another important aspect of filters is that they do not necessarily need to be
applied linearly.  The reason is that Filters are used for determining which
Action reacts to the incoming request--and Actions have no effect on the
XML.

Perhaps I am jumping around here with Filter based Actions in one breath and
talking about Input Transformers in the next breath.  There can be a danger in
nondeterministic or asynchronous behavior, so Filters applied in any order may
prove to be an incorrect pattern.  However, you can apply deterministic chains
of actions so that unassociated actions can be applied in any order, but ones
that require a specific order can be preserved in the chain.  This also avoids
requiring us to step through the list of all filters at once--something that
I think would be too much overhead.  Again, this does require more in-depth
thinking.
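
To make the chain idea a little more concrete, here is a small sketch under
my own assumptions: the Action interface, the named() helper and the action
names are invented for illustration.  Whole chains may run in any order, but
the steps inside a chain always keep theirs:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ActionChainSketch {

    /** Hypothetical action: pure logic, here it only records that it ran. */
    interface Action {
        void act(List<String> log);
    }

    static Action named(String name) {
        return log -> log.add(name);
    }

    /** Runs every chain; the order of chains is not guaranteed, the order within one is. */
    static List<String> run(List<List<Action>> chains) {
        List<List<Action>> order = new ArrayList<>(chains);
        Collections.shuffle(order);          // unassociated chains: any order will do
        List<String> log = new ArrayList<>();
        for (List<Action> chain : order) {
            for (Action action : chain) {    // inside a chain the declared order is kept
                action.act(log);
            }
        }
        return log;
    }

    public static void main(String[] args) {
        List<List<Action>> chains = Arrays.asList(
            Arrays.asList(named("authenticate"), named("store-form")),
            Arrays.asList(named("log-request")));
        List<String> log = run(chains);
        // "authenticate" always precedes "store-form"; "log-request" may land anywhere.
        System.out.println(log);
    }
}
```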

Currently we have the following declared needs:

Actions:      Pure logic handlers that have no effect on the XML.
Transformers: Reactor pattern for XML documents--on input they would merely
               need to read the XML, but on output may need to add snippets.
Source:       The target information the request was destined for.
Serializers:  Convert the Event based pipeline to a stream based pipeline.


> The other thing is how would caching play into such a beast? I think that's 
> an important concept to think about up front. With SOAP it's not really an 
> issue since more often than not you don't want to cache function calls, but 
> for web content caching is king. I guess if the input pipeline and output 
> pipeline are separate, then the output pipeline can still be partially 
> cacheable similar to how cocoon acts today?


It is hard to say.  The most important caching mechanisms are on the output.
However, large attachments are more efficient if they are cached to a temporary
filesystem for later manipulation.  It is a tradeoff that may need a good
cost/performance analysis (not in money but in effort/reliability).


> For webapps, this could be *very* powerful if attached to something like 
> XML-Schema, since you could then have an input transformer that does datatype 
> validation.


:)  But not all the time.  That is one of the primary reasons that validation
is turned off for the XML Parser.

For input markup, Sun envisions JAXB to be the better way to validate the
information.  In other words, you bind the XML to schema generated beans.
The beans know how to marshal/unmarshal the XML to and from the beans.  For
those familiar with traditional Java programming, this provides an attractive
interface.

It is more efficient to allow the Schema generated beans to validate the
hierarchy because they do not make any unnecessary tests.  When you add
this to the melting pot of ideas, Cocoon now becomes a monstrous beast of
XML activity.  This could be good, and it could be bad.  The important thing
is to engineer Cocoon to the point that it works like magic ;p.


>>While I can appreciate the difference between Session and Request
>>variables, there are just too many access points to worry about. 
>>Environment simplification is another matter altogether.
>>
> 
> I enjoy having two different places to stick things now, as by making Request 
> attributes, I know they will all start as null on the next request. Of course 
> you could make the Environment have multiple setAttributes, one for 
> persistent attributes and one for the current active transaction.


My major point of contention is the distinction between request attributes and
request parameters.  This is especially true when you want to deal with information
that is not a string.  Case in point: file uploading.  You cannot get a file handle
via the Request parameters interface--you MUST do it through the request attributes
part.  The problem is that not every servlet container implements request attributes
properly.

If the Request object attributes/parameters were pared down to simply this:

Object Request.get(String key);
void Request.put(String key, Object value);

we could simplify the environment more, and provide an implementation that always
worked in every servlet environment.  More importantly, these request attributes
and parameters would be WORM (Write Once/Read Many--like a CD-R).  That
way, you can pass new information between actions and transformers, but are guaranteed
that a rogue handler cannot overwrite your values with bogus ones.
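
A minimal sketch of what the write-once behaviour could look like.  The class
name WormRequest, the choice of exceptions, and the rejection of null values
are my assumptions, not a settled design; only get() and put() come from the
two signatures above:

```java
import java.util.HashMap;
import java.util.Map;

public class WormRequest {

    // One map for what used to be split into attributes and parameters.
    private final Map<String, Object> values = new HashMap<>();

    /** Returns the value stored under key, or null if nothing was stored. */
    public Object get(String key) {
        return values.get(key);
    }

    /** Stores a value once; a second put for the same key is refused. */
    public void put(String key, Object value) {
        if (key == null || value == null) {
            throw new IllegalArgumentException("key and value must not be null");
        }
        if (values.containsKey(key)) {
            // Assumption: overwriting is an error rather than silently ignored.
            throw new IllegalStateException("'" + key + "' has already been written");
        }
        values.put(key, value);
    }

    public static void main(String[] args) {
        WormRequest request = new WormRequest();
        request.put("upload", new java.io.File("report.xml"));  // not limited to strings
        try {
            request.put("upload", "bogus");                     // a rogue handler tries to overwrite
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println(request.get("upload"));
    }
}
```

This sketch is not thread-safe; it assumes one request object is only ever
touched by the thread handling that request.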


-- 

"Those who would trade liberty for
  temporary security deserve neither"
                 - Benjamin Franklin

