cocoon-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: Substractive view labels (long)
Date Wed, 27 Mar 2002 18:56:40 GMT
Sylvain Wallez wrote:

> Let me put back the explanations I gave to Volker Schmitt, with some
> more details. The below sitemap will be used for these details
> (high-level structural elements skipped for simplicity) :
> 
> <map:view name="content" from-label="content">
>   <map:transform src="content2html.xsl"/>

you missed a <map:serialize> here, or am I wrong?

> </map:view>
> 
> <map:resource name="foo">
>   <map:transform src="foo.xsl" label="content"/>
>   <map:serialize/>
> </map:resource>
> 
> <map:resource name="bar">
>   <map:act type="updatedatabase"/>
>   <map:transform src="bar.xsl"/>
>   <map:serialize/>
> </map:resource>
> 
> <map:pipeline>
>   <map:match pattern="foobar">
>     <map:generate src="foobar.xml" label="content"/>
>     <map:act type="findtype">
>       <map:call resource="{type}"/>
>     </map:act>
>   </map:match>
> </map:pipeline>
> 
>                             ---oOo---
> 
> From a user's point of view, knowing when branching will occur can become
> a nightmare since you have to crawl all branches that can participate in
> a request handling (matchers, selectors, actions, resource calls, etc)
> in search for this last label.

The same already applies to views that are connected to the 'last'
component, since you don't know which component that is unless you've
executed the pipeline completely.

> In the above sitemap, there's a "content" label on the generator, but
> the "foo" resource also has this label. If the last label is used, you
> cannot know by reading the "foobar" pipeline if the view will start at
> the <map:generate> or not. You have to examine all possible branches
> (and in the above case, they're dynamic) to find other places where the
> same label is used.
> 
> Using the first label makes the behaviour more predictable : if a
> labelled statement in the sitemap is reached, then we *know* that the
> view starts at this statement.

Granted, although we all agree (as you rightly point out below) that
this behavior leads to 'inelegant' uses of the sitemap semantics, with
different components (and their pools) being created just because of
different view behaviors.

>                             ---oOo---
> 
> Implementation will be difficult, as it requires the whole regular
> pipeline (the view-less one) to be built before deciding at which point
> should occur branching.
> 
> I agree that specs shouldn't be constrained by implementations details.
> However, we must be aware that this requires some big changes in the
> existing pipeline architecture to "break" the regular pipeline at a
> point. But the important point here is that we need to fully build the
> regular pipeline to know the branching point (see below).

Ok

>                             ---oOo---
> 
> Corollary to the previous point, building the regular pipeline may have
> some side effects (e.g. actions) _after_ the branching label, but we
> cannot know beforehand that these actions shouldn't have been executed
> because they're not in the view.

Ok

> This is illustrated in the above sitemap : the "bar" resource has an
> action that modifies the system state, but since there is no "content"
> label in the "bar" resource, the view starts from the generator, that is
> *before* the action in the sitemap flow. Should this action be executed
> when the view is requested ?

Yes, it must be, in order to work out "which" label to exit from. Even
with 'subtractive view labels' implemented, it is perfectly legal to
have something like the above (with no explicit subtraction), meaning
that if the action returns 'bar', the 'content' view is associated with
the generator, but if the action returns 'foo', the 'content' view is
associated with the transformer of the 'foo' resource (see more on this
below).
 
> This leads to an interesting question. In "retuning sitemap design" (see
> http://marc.theaimsgroup.com/?l=xml-cocoon-dev&m=101057440717758&w=2),
> you classify sitemap statements in two main categories : direct
> components (generators, transformers, etc) and indirect components
> (matchers, selectors, actions, etc). How does view handling relate to
> this classification ?

Views are orthogonal to pipelines, they aren't designed to stop the
execution of the pipeline somewhere for performance reasons, but just to
provide access to internal points of the pipeline in an explicit and
well-determined way.
 
> In the current definition of views, there is no difference between
> direct and indirect components, and the sitemap is executed up to a
> matching label that causes a jump to the view. Components after the view
> label, both direct and indirect, aren't executed.

Eh, I know this: in fact the original design of the views was supposed
to stop execution at the last encountered label, not the first one!

> If we change the behaviour and branch from the last label, this means
> that *all* indirect components of the regular pipeline are to be
> executed because they perform the routing in the sitemap and their
> execution is therefore required to find the last label.
> 
> Is this really what we want?

Yes, I see no way around this.

> My opinion is no : this would make
> understanding views and predicting their behaviour really difficult.

Yes, this is a valid concern, but we must make sure to implement the
best solution for the problem and I personally believe that the 'exit
from last label' behavior is the cleanest way to implement this, even if
this, admittedly, makes view behavior less explicit.

> Views
> are a powerful concept and many users already have difficulties to
> master them. Turning them to black magic won't promote their use.

Absolutely, but the view semantics in the sitemap were not intended to
be verbose and explicit, but rather implicit, so that view behavior
could be easily inherited by subsitemaps (see below).
 
>                             ---oOo---
> 
> Conclusion (thanks for those who have read all of the above ;)
> 
> I consider the label-on-component feature a writing facility to avoid
> tedious repetition in the pipelines. 

Yes, it was placed there for that reason and for more (again, see
below).

> The documentation sitemap (with
> "file" / "file-nolabel") clearly shows that views are attached to a
> particular DTD that exists at some places in the pipelines, and that
> attaching their labels to general-purpose components like the resource
> generator may not be a good thing : 80% of the uses of that generator
> produce the correct DTD, but we need to be able to handle the remaining
> 20% without sacrificing the writing facility. That's why I suggested
> these "substractive labels" to avoid the declaration of a new component
> and the associated overhead.
> 
> Also, I didn't find in the archive the reasons for this "move to last
> label" todo. And I wouldn't be happy if this was proposed as a
> workaround for the label-on-component problem. We should not constrain
> the definition of views by the bad side-effects of a writing facility.

Ok, these are all great points and must be addressed in full detail.

                             ---oOo---

Views, by design, must apply to entire collections of pipelines. They
must be general enough to have a significant meaning when projected on
top of any pipeline.

For this reason, Views are defined *externally* from the pipelines and
can be seen as 'generator-less' pipelines, where the generator's role is
played by the part of the pipeline on top of which the view is projected.

It is the view's responsibility to indicate *from where* the view should
connect to the pipeline. There are two different ways for the view to
indicate the "exit point" of the pipeline:

 1) positional: 

       first -> right after the generator
       last -> right before the serializer

 2) indirect (labelled):

       from a first occurrence of the specified label

The 'first' positional is the easiest to implement, but it's useful only
in trivial cases.

The 'last' positional is much more important and requires the entire
pipeline to be executed, but it can be seen as replacing the pipeline's
serializer with another, more complex one (the view's generator-less
pipeline).

the 'labelled' indirect location connects the input of the view (which
globally is a consumer of SAX events) to the output of the component
which has that label, either explicitly written, or inherited from the
component definition (either in the current sitemap, or in the closest
sitemap parent).
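
To make the three connection points concrete, here is a minimal sketch
of how such views and an implicitly labelled component might be
declared. The from-label attribute appears in Sylvain's sample above;
the from-position attribute, the view names "raw" and "pretty", the
stylesheet pretty.xsl and the generator class name are assumptions made
for illustration and should be checked against the sitemap version
actually in use.

```xml
<map:components>
  <map:generators default="file">
    <!-- Label declared on the component definition: every use of the "file"
         generator (in this sitemap or a subsitemap) carries "content"
         implicitly. The class name is an assumption. -->
    <map:generator name="file" label="content"
                   src="org.apache.cocoon.generation.FileGenerator"/>
  </map:generators>
</map:components>

<map:views>
  <!-- positional, first: connects right after the generator (hypothetical view) -->
  <map:view name="raw" from-position="first">
    <map:serialize/>
  </map:view>

  <!-- positional, last: the view takes the place of the pipeline's serializer -->
  <map:view name="pretty" from-position="last">
    <map:transform src="pretty.xsl"/>
    <map:serialize/>
  </map:view>

  <!-- indirect: connects to the output of a component labelled "content" -->
  <map:view name="content" from-label="content">
    <map:transform src="content2html.xsl"/>
    <map:serialize/>
  </map:view>
</map:views>
```

Only the third form depends on labels, which is why the multiple-label
case matters for it and not for the positional ones.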

If a pipeline has one and only one label (either explicit or implicit),
there is no problem.

We must address the cases where more than one component has the exact
same label.

                             ---oOo---

If a view is connected to a label, the pipeline must have these labels
attached by the pipeline writer, since in this case it's not the view's
responsibility to provide a specific positional element (it doesn't make
sense to have a positional view call to the 'second' or 'third'
component in the pipeline!).

The choice I made in the original design was to attach those labels
implicitly to the components at instantiation time. This made the system
more implicit and views harder to understand from the sitemap directly,
but made the system much less verbose and view properties easier for
subsitemaps to 'inherit'.

In fact, this was the main reason rather than verbosity: I was afraid of
people *not getting* the view concept at first, so I made it possible
for them to 'inherit' their capabilities without having to write
anything more than what they were writing.

And I think this worked very well, especially since I was able to
implement the command line functionality without even letting people
know I was using specific 'views' on top of their samples (and they
didn't even know what views were).

                             ---oOo---

When I was designing the view concept, I was looking at resources from
the 'user-agent' perspective and wanted to provide a way for them to
'unlock' the resources and have access to specific 'views' of them.

Looking into the resource from the outside (looking into the
serializer, so to speak) led to the solution of having the pipeline
stop at the 'last' encountered label, which is the "first" label that
the 'user-agent' would encounter if it were scanning the pipe from the
outside in.

Why so?

Well, the server uses the pipelines to *augment* the information and
shape it until it's ready for consumption (in case of POST/PUT requests)
or production (in case of GET/HEAD requests).

The 'user-agent' looks at the pipelines from the outside-in and wants to
connect to 'different' or less processed 'pipeline stages'.

With this vision in mind, it looks very elegant to provide this 'exit
from last label' behavior, because it is the behavior that a
'user-agent' would expect when decomposing the pipeline from the
outside looking in.

                             ---oOo---

I completely understand that implementing this is harder and somewhat
less efficient than the 'exit from first label' behavior, but we all
agree that the latter is a bad behavior and must be eliminated.

So, there are two proposals so far on the table:

 1) Stefano's "exit from last label"
 2) Sylvain's "subtractive labels"

My proposal doesn't require any change in the sitemap semantics, while
Sylvain's requires pipelines to explicitly 'remove' labels that are
implicitly defined at the component definition level.

So, taking the current example:

   <map:match pattern="body-todo.xml">
     <map:generate type="file-no-label" src="xdocs/todo.xml"/>
     <map:transform src="stylesheets/todo2document.xsl" label="content"/>
     <map:transform src="stylesheets/document2html.xsl"/>
     <map:serialize/>
   </map:match>

which uses the ugly hack of having a label-less generator type, in my
proposal the above would be rewritten as:

   <map:match pattern="body-todo.xml">
     <map:generate type="file" src="xdocs/todo.xml"/>
     <map:transform src="stylesheets/todo2document.xsl" label="content"/>
     <map:transform src="stylesheets/document2html.xsl"/>
     <map:serialize/>
   </map:match>

while in Sylvain's it would be rewritten as:

   <map:match pattern="body-todo.xml">
     <map:generate type="file" src="xdocs/todo.xml" label="-content"/>
     <map:transform src="stylesheets/todo2document.xsl" label="content"/>
     <map:transform src="stylesheets/document2html.xsl"/>
     <map:serialize/>
   </map:match>

Sylvain's point is that 'subtractive labels' are easier to understand
since they are more explicit and their implementation is easier because
those labels indicate to the sitemap engine what to do.

Let me rewrite Sylvain's sample from the top of this message:

 <map:resource name="foo">
   <map:transform src="foo.xsl" label="content"/>
   <map:serialize/>
 </map:resource>
 
 <map:resource name="bar">
   <map:act type="updatedatabase"/>
   <map:transform src="bar.xsl"/>
   <map:serialize/>
 </map:resource>
 
 <map:pipeline>
   <map:match pattern="foobar">
     <map:generate src="foobar.xml"/>
     <map:act type="findtype">
       <map:call resource="{type}"/>
     </map:act>
   </map:match>
 </map:pipeline>

where I removed the explicit label 'content' from the pipeline
generator.

Here, the *wanted* behavior is to have the 'content' view associated to
the generator for the 'bar' type and to the transformer for the 'foo'
type.

If we explicitly subtract the label from the generator, the behavior for
the 'bar' type becomes undefined.
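
Spelled out, the two run-time shapes of the rewritten "foobar" pipeline
look roughly like this. It is only an illustrative flattening, not
sitemap syntax meant to be deployed: it assumes that the default
generator definition carries an implicit "content" label, and it leaves
out the "findtype" action, which is taken to have already picked the
resource.

```xml
<!-- type = "foo": two "content" labels are met; 'exit from last label'
     picks the transformer -->
<map:generate src="foobar.xml"/>                <!-- implicit "content" (assumed) -->
<map:transform src="foo.xsl" label="content"/>  <!-- wanted branching point -->
<map:serialize/>

<!-- type = "bar": only the generator is labelled, so the view branches
     right after it -->
<map:generate src="foobar.xml"/>                <!-- implicit "content": wanted branching point -->
<map:act type="updatedatabase"/>
<map:transform src="bar.xsl"/>
<map:serialize/>

<!-- With label="-content" written on the generator, the "foo" case still
     works, but the "bar" case has no "content" label left for the view
     to connect to. -->
```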

This is to show that 'subtractive labels' might do more harm than good
in those admittedly complex examples, where understanding view behavior
is already hard because labels are implicit.

So, in the end, I don't think my proposal is a way to defend a poor
choice of labelling; it is rather the way the view system was designed
from day one (but not implemented that way because of technical
difficulties, the same ones that Sylvain is facing right now, and I
totally agree they are rather complex).

But I don't think the proposed subtractive labels make the system any
more elegant and the implementation any less hard.

But, of course, this is only my very personal perception of the problem
so I'll be very interested in seeing what others think about this.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<stefano@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------


---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org

