cocoon-dev mailing list archives

From Berin Loritsch <>
Subject Re: [RT] Cocoon web applications
Date Fri, 05 Oct 2001 14:40:35 GMT
Stefano Mazzocchi wrote:
> > Ok.  I can agree with that statement.  Keep in mind that for Cocoon
> > app installation you have to modify both the unarchived war and the
> > archived war file.  The reason is that SOME servlet containers ignore
> > the original war file once it is deployed.  SOME servlet containers
> > overwrite the contents of the directory with the contents of the war
> > file.  And still OTHERS act like the second scenario until the unarchived
> > directory is modified.
> >
> > Don't you just love it when there is no standard?
> We have more than one Apache member listed on JCP 053, which is
> responsible for the creation of that spec. If you think it's important
> to specify what behavior the servlet container should take when
> deploying a webapp, please, let's discuss a formal proposal that we
> might submit to the group and that might make it into Servlet 2.4

I believe that the "work" directory was proposed for normal access to
writable areas.  The down side is that the location of the directory
is not easy to predict.  You can work around the spec with direct file
access, but that is not easily portable between environments and you
have to place entries in the policy file granting you access to those
locations.  I think that a Servlet that manages its own internal context
might be worth exploring.  There are issues such as overwriting your
own files, etc.  I don't know if it is wise to open it to all servlet
developers.  Before I can come up with anything formal, I need to know
what environment I need.

> > That is what I am referring to.  As of Servlet 2.3 and much debate, the
> > official stance on where "/resource" maps you is to the web server root,
> > not the context root.  Instead, the context root is much more difficult
> > to reach.  Perhaps we can improve the HTML serializer to automagically
> > correct context root resources.
> Yuck! I'd hate that. Serializers that mangle things behind your back are
> the worst pain in the ass to find out, especially because you never look
> at them since you normally consider them brainless and pure adapters
> from the XML world to the binary world.
> Let's find a more elegant way.

OK, any ideas?

> > Let me expound.  I like to use a directory structure like this:
> >
> > /xdocs
> > /resources
> >       /images
> >       /scripts
> >       /styles
> > /stylesheets
> >       /system
> > /WEB-INF
> >       /logicsheets
> >       /cocoon.xconf
> >       /logkit.xconf
> > /sitemap.xmap
> > /${}
> >       /xdocs
> >       /resources
> >             /images
> >             /scripts
> >       /sitemap.xmap
> >
> > The problem is when I want a consistent look and feel in my ${}
> > area.  I cannot access the /stylesheets that are accessible via the
> > context--but not via the sitemap.  This requires me to copy the
> > /stylesheets to the ${}.
> Ok, in this case, an absolute URI would work and would not require
> access to your parent, but to an absolute location (which, in this case,
> accidentally, happens to be your parent)
> This is a simple fix and we can schedule it for Cocoon 2.1 since it
> might break back compatibility of sitemaps a little.

Sounds good.

> > Because Cocoon is an XML framework, in order for this approach to work,
> > you have to define the interfaces.  There are definite roles that I
> > have already identified.  Some of the solutions come from concepts in
> > SOAP, and some of the solutions come from concepts in JNDI, but here goes.
> >
> > For sub applications to work, you must have them work to a specific schema.
> > (this concept is from SOAP).  For instance, your resource must return
> > the results in DocBook format so that the parent knows how to apply views.
> > This is the interface of your "component".
> I've already thought about this when I thought about a way to validate
> sitemaps and it's a *LOT* more complex than this.
> Let's make an example: the "behavioral interfaces" of pipeline
> components are the expected input namespaces and the resulting
> namespaces. But listing them is not enough: you must know the exact
> structure, thus the namespace-aware schemas.
> Even between components, schemas are the structure description that
> identify the expected "shape" of the SAX pipe that connects two
> components.
> Now, suppose you have a pipeline such as
>  <g] -> [t1] -> [t2] -> [s>
> and you have
>  g -> output schema of generator
>  t1i -> input schema of first transformer
>  t1o -> output schema of first transformer
>  t2i -> input schema of second transformer
>  t2o -> output schema of second transformer
>  s -> input schema of serializer
> with all this information you can precisely estimate if the pipeline is
> "valid", in a behavioral sense.
> This would allow you to perform some pretests on sitemaps (before
> compilation and before uploading) that avoids those "impedance
> mismatches" between connected components.

This is excellent--validation is vital!  I know my practices, and I tend
to use existing schemas, only inventing if necessary.  When I do invent
a schema, I always have it generated by a logicsheet and provide a
transformation to the main document schema.  This works for me, because
it is a known environment.

What you are talking about is validating not only that I am doing my
job right, but that other people on my team don't make simple mistakes.
The only caveat is that the validation shouldn't be done during live serving.

I think we do need to have schema validation on during development (esp.
when designing new schemas) to ensure the app works, but have it off for
deployment--something the deployment tool can ensure.

> As more and more Cocoon components emerge and are made available even
> outside the Cocoon distribution, the ability to estimate the "behavioral
> match" between two components, will very likely be vital, especially for
> sitemap authoring tools.
> The algorithm that performs the validation is far from being trivial: a
> sufficient condition (and the most simple one) requires the connecting
> ends to be identified by the exact same schema.
> So, the above pipeline would be valid *if*
>  t1i == g
>  t2i == t1o
>  s == t2o
> but this is not a necessary condition since there exist cases where a
> pipeline is behaviorally valid even if the two subsequent schemas don't
> match completely, but only on parts.

Just to add a little more complexity to the system: now that we have
namespaces, we can have multiple schemas in one document.  Therefore, the
transformation and serialization layers must be even more specific.

As an example, let us use a recent real life scenario.  I created a Cocoon
app that manages schematics (maps of where products go on a retailer's shelf)
and the location of the schematics (which shelves in the store use the
schematic).  As a result, I had to create a schema for the schematics and
the location (sharing a namespace).  It was not uncommon for the generator
to produce a document with the document schema and the schematic schema.
Your validation code has to be further expanded to include namespace
resolution like this:

Document ns:  [doc]
Schematic ns: [schem]
XHTML ns:     [xhtml]
Any ns:       [*]

g[doc][schem] ->
t1i[*][schem] ->
t1o[doc]      ->
t2i[doc]      ->
t2o[xhtml]    ->

This is actually a simplified pipeline (the real one used aggregation for
the menu, etc).
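A stage declaration like t1i[*][schem] can be checked with simple set logic. A rough sketch (my own reading of the notation, where "*" means "anything else passes thru"):

```python
# Rough sketch: an input declaration like t1i[*][schem] matches any
# document whose namespace set contains "schem"; the "*" lets the
# remaining namespaces pass thru untouched.

def stage_matches(doc_ns, declared_in):
    required = declared_in - {"*"}    # the namespaces the stage insists on
    return required <= doc_ns         # subset test

assert stage_matches({"doc", "schem"}, {"*", "schem"})
assert not stage_matches({"doc"}, {"*", "schem"})
```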

Using this approach of specifying the expected schemas and the output schemas,
we can go beyond simple validation, and do automatic discovery and use of the
needed transformation layers.  That way, when a generator mixes several schemas
together (I had one instance where I had up to five in one document--different
project), I don't need every request to go through the whole transformation
chain.  Let's take the same example above, and add a separate "location" schema
to the mix:

Document ns:  [doc]
Schematic ns: [schem]
Location ns:  [loc]
XHTML ns:     [xhtml]
Any ns:       [*]

UnOptimized                              Optimized
-----------------                        -----------------
g[doc][schem] ->                         g[doc][schem] ->
t1i[*][schem] ->                         t1i[*][schem] ->
t1o[doc]      ->                         t1o[doc]      ->
t2i[*][loc]   ->                         t2i[doc]      ->
t2o[doc]      ->                         t2o[xhtml]    ->
t3i[doc]      ->                         s
t3o[xhtml]    ->

The end result of both pipelines is the same, so the different path does not
affect the cache validity.  If the original doc had all three source schemas,
then the full path would have been used.  This has one more added optimization:
if the t[*][loc]->[doc] transformer changes, it does not affect the cache
validity of the optimized path.
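The optimization could be planned mechanically: walk the pipeline, skip any transformer whose required namespaces are absent from the document, and track the namespace set as each stage rewrites it. A sketch (the stage triples and set semantics are my assumptions, not an existing API):

```python
# Hypothetical planner: each transformer is (name, required_ns, produced_ns);
# "*" in required_ns means other namespaces pass thru. A stage is skipped
# when its required namespaces are not present in the document.

def plan_pipeline(doc_ns, transformers):
    ns, plan = set(doc_ns), []
    for name, required, produced in transformers:
        needed = required - {"*"}
        if not needed <= ns:
            continue                        # nothing to transform: skip stage
        plan.append(name)
        ns = (ns - needed) | set(produced)  # stage rewrites those namespaces
    return plan, ns

stages = [("t1", {"*", "schem"}, {"doc"}),
          ("t2", {"*", "loc"},   {"doc"}),
          ("t3", {"doc"},        {"xhtml"})]

# Doc without [loc]: t2 is skipped (the optimized path above).
assert plan_pipeline({"doc", "schem"}, stages) == (["t1", "t3"], {"xhtml"})
# Doc with all three source schemas: the full path is used.
assert plan_pipeline({"doc", "schem", "loc"}, stages) == (["t1", "t2", "t3"], {"xhtml"})
```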

> In fact, the input schema might work only on part of the previous output
> schema, for example, working only on one namespace and leaving the
> other elements pass thru unchanged.


> But in this case, in order to be possible to continue the validation,
> the output schema must state what can be left pass thru.

Not necessarily.  If you use my example above, the namespaces used are all
declared in the generator.  To show how the validator would work with all
three schemas in use, check this out:

Document ns:  [doc]
Schematic ns: [schem]
Location ns:  [loc]
XHTML ns:     [xhtml]
Any ns:       [*]

g[doc][schem][loc] ->
t1i[*][schem]      ->
t1o[doc][loc]      ->
t2i[*][loc]        ->
t2o[doc]           ->
t3i[doc]           ->
t3o[xhtml]         ->

As you can see, the validator tracks the namespaces used at each OUTPUT
point: g, t1o, t2o, and t3o.  It is easy to track the document namespaces.  The
big thing is that if a transformer or generator uses any intermediate namespaces
during processing, it needs to clean up after itself.  For example, the esql
logicsheet or SQLTransformer use a namespace to describe how to pull information
from a database--however, none of that information is transferred in the document
markup.  Currently, the generator calls the start and end namespace for the
logicsheet/transformer, but no elements are passed using the namespace.  This
presents added complexity to the validator.  We might be able to use the
SAXConnector approach to strip the unnecessary namespace arguments.  That
would require caching the SAX calls until the namespace is closed or the first
element using the namespace is found.
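The buffering idea could look roughly like this (a toy event model, not real SAX; the class and method names are mine): hold back each prefix mapping and only forward it if an element in that namespace actually shows up before the mapping closes.

```python
# Hypothetical sketch of the stripping connector: buffer startPrefixMapping
# events and forward one only when the first element in that namespace
# arrives; a mapping that closes unused is dropped silently.

class NamespaceStripper:
    def __init__(self, downstream):
        self.downstream = downstream      # list collecting forwarded events
        self.pending = {}                 # prefix -> uri, held back for now

    def start_prefix_mapping(self, prefix, uri):
        self.pending[prefix] = uri        # don't forward until first use

    def start_element(self, uri, name):
        for p, u in list(self.pending.items()):
            if u == uri:                  # first element in a pending ns
                self.downstream.append(("startPrefixMapping", p, u))
                del self.pending[p]
        self.downstream.append(("startElement", uri, name))

    def end_prefix_mapping(self, prefix):
        if prefix in self.pending:
            del self.pending[prefix]      # never used: strip it
        else:
            self.downstream.append(("endPrefixMapping", prefix))

out = []
s = NamespaceStripper(out)
s.start_prefix_mapping("esql", "urn:esql")   # declared but never used below
s.start_prefix_mapping("d", "urn:doc")
s.start_element("urn:doc", "page")
s.end_prefix_mapping("d")
s.end_prefix_mapping("esql")
assert out == [("startPrefixMapping", "d", "urn:doc"),
               ("startElement", "urn:doc", "page"),
               ("endPrefixMapping", "d")]
```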

> I don't want to get deeper into these details, but I just wanted to show
> you that establishing behavioral composition on pipeline components is a
> lot more complex than you described.
> But, yes, it can and needs to be done.
> > StreamResources: Take any source and goes completely through serialization.
> >                  This is basically an alternate for Readers, although it
> >                  can also be used for generated reports.
> >
> > FlowResources: A mounted flowmap that performs all the logic necessary for
> >                a complex form.  It handles paging, etc.  It is a type of
> >                compound resource in that it pools several simple resources
> >                together, and returns the one we are concerned with at the
> >                moment.
> >
> > URIMapResources: A compound resource that maps URIs to specific simple
> >                  resources.
> >
> > SitemapResource: A compound resource that is a sub sitemap.  Sitemaps are
> >                  completely self contained, so it is near impossible to
> >                  override their results.
> I'm not sure about these, though. Could you give me some pseudo-example
> of a pseudo-sitemap and how it would use the above?

My thinking on a StreamResource was that the sub cocoon app would completely
handle that resource.  So whether that resource was a Reader or a full pipeline
does not need to be known by the parent.

As to markup, I am not sure yet.  We need a conceptual model that works before
we can express the markup.

> > A sub application can specify resource adaptors for its native XML generators,
> > for instance you might have a document schema and a schema for an inbox.
> > If the parent has a View that recognizes the inbox schema, then it will
> > directly use that schema.  If not, the sub application will specify a default
> > mapping.
> >
> > Hopefully this is enough to get us started.
> I understand very well the concept of schema-based adaptation, but I
> think I lost you on the other resources, I think a couple of dirty
> examples will get me closer to your point.

Hopefully, I can model it in ASCII....

+--------------------+ get(stocking-section) +---------------------------+
| Root Cocoon App    |---------------------->| Stocking Section App      |
| schema: [doc][loc] |<----------------------| schema: [doc][loc][schem] |
+--------------------+    rcv([doc][loc])    +---------------------------+

In the above "diagram", the root Cocoon app is designed to accept the
[doc] and [loc] schemas (to carry on the previous examples), but has no
knowledge of the [schem] schema.  The Stocking Section App is registered
to output [doc], [loc], and [schem] schemas.  If the whole app is engineered
to the [doc] schema (that being the target), Stocking Section App would
provide adaptors for the [loc] and [schem] schemas to convert to the end
[doc] schema.  If the parent app and the child app register the expected
schemas with each other, the sitemap will return any schemas that can
be handled natively.

IOW, Root registers [doc] and [loc] with Stocking.  Stocking configures
itself so that it does not transform the [loc] schema--assuming that the
parent knows how to handle it.  However, because the Root did not state
that it could handle [schem] schema, Stocking applies the transformation
for that.
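In pseudo-code, the negotiation boils down to something like this (function and schema names are hypothetical):

```python
# Illustrative sketch of the parent/child negotiation above: the child
# only applies adaptors for the schemas the parent did NOT register as
# natively handled (the target schema needs no adaptor either).

def adaptors_to_apply(child_outputs, parent_handles, target):
    """Return the schemas the child must transform toward the target."""
    return {s for s in child_outputs
            if s != target and s not in parent_handles}

# Root handles [doc] and [loc]; Stocking outputs [doc], [loc], [schem]:
assert adaptors_to_apply({"doc", "loc", "schem"}, {"doc", "loc"}, "doc") == {"schem"}
```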

> > > In short, you are asking for more solid and wiser contracts between web
> > > applications and I believe that an absolute URI space accessing is
> > > already a solid contract, but the proposed role-based addressing is a
> > > killer since it allows strong contracts and still complete flexibility
> > > and scalability.
> >
> > Yep. Well defined contracts reduce cognitive dissonance.  Too many contracts
> > increase cognitive dissonance.
> Careful about using that term: "cognitive dissonance" is a good thing in
> many situations since modern learning theories give it the role of
> difference maker between short term and long term learning.
> In fact, they suggest that something gets learned only when there is
> cognitive dissonance and your brain must work to overcome it, normally
> by creating the abstractions that make it possible for the two
> cognitive concepts to resonate and overlap with your existing semantic
> environment.

See what they pollute your minds with at school?  Keep in mind you are talking
to someone with an Associates in the Recording Arts.  Psych was not part of
the lesson plan (however, psychoacoustics was...).

I get your point though.

> I'd love to continue research on this topic by letting practical things
> like real-life user experience as well as more theoretical things like
> cognitive science influence our decisions on how to make this project
> evolve.

I have practical knowledge and real-life user experience.  I'll have to rely
on your expertise for the cognitive sciences.  I know _some_ of the concepts
because I have mentored others--but not nearly the detail you do.
