cocoon-dev mailing list archives

From Ricardo Rocha <rica...@apache.org>
Subject [RT] Rationalizing XSP (was: XSP and aspects)
Date Fri, 02 Mar 2001 14:49:40 GMT
Hi guys,

In response to the enlightening "XSP and aspects" thread posted by Allan
Erskine and Stefano Mazzocchi, I'd like to summarize and comment on the
various issues and goals discussed:

- Capturing aspects in XSP
- Making XSP a general-purpose factory for server components (other than
  server pages)
- Refactoring XSP implementation to avoid "the same but different"
  syndrome visible in the redundant implementation of server pages and
  the compiled sitemap
- Avoiding "language impedance" by placing only pure markup in XSP pages
  with no recourse to the underlying programming language
- Defining a logicsheet language: SiLLy

I apologize if the ideas presented here look too implementation-oriented,
Cocoon-biased or Java-biased. Despite this, I feel most of the concepts
discussed are applicable to XSP at large (and, therefore, may also be
applicable to AxKit's Perl XSP).

1) Capturing aspects in XSP

In Allan's words, capturing aspects in XSP boils down to choosing markup
to represent and encapsulate what would otherwise be a crosscutting
feature.

To begin, XSP is generally seen as a (source) code-generation markup
language.

  NB: Implementation-wise, though, it could be an _object composition_
      language instead, with no need to resort to code generation
      except, perhaps, for supercompilation-like optimizations. (I'd
      like to elaborate on this later, on a separate post: a
      composition-oriented approach to dynamic XML generation based on
      Avalon patterns.)

AspectJ (our reference for AOP) acts as a Java preprocessor, so it too
relies on code generation.

Aspects observe and react to object events such as method invocations or
exception handling. AspectJ achieves this by inserting aspect-support
code at appropriate spots (join points) so that the crosscutting feature
is accounted for in a transparent and uniform fashion across different
classes.

  OT: With the introduction of dynamic proxies in Java 1.3,
      preprocessing is no longer the only obvious option: it's now
      possible to trap method invocations and exception throwing using
      dynamic proxies.
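
As a minimal sketch of the dynamic-proxy option, the following uses
java.lang.reflect.Proxy to trap both a method call and a thrown
exception. The Greeter interface, TracingHandler and ProxyDemo names are
hypothetical and exist only for illustration; the syntax is modern Java
rather than 1.3-era Java.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical component interface; not part of Cocoon or AspectJ.
interface Greeter {
    String greet(String name);
}

// A tracing "aspect" expressed as an invocation handler.
class TracingHandler implements InvocationHandler {
    private final Object target;
    final List<String> log = new ArrayList<>();

    TracingHandler(Object target) {
        this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        log.add("enter " + method.getName());
        try {
            Object result = method.invoke(target, args);
            log.add("exit " + method.getName());
            return result;
        } catch (InvocationTargetException e) {
            // The target itself threw: record it, then rethrow the original.
            log.add("throw " + e.getCause().getClass().getSimpleName());
            throw e.getCause();
        }
    }
}

class ProxyDemo {
    // Wraps the handler's target in a proxy that implements Greeter.
    static Greeter traced(TracingHandler handler) {
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        Greeter plain = name -> {
            if (name == null) throw new IllegalArgumentException("no name");
            return "Hello, " + name;
        };
        TracingHandler handler = new TracingHandler(plain);
        Greeter greeter = traced(handler);
        System.out.println(greeter.greet("Cocoon"));
        try {
            greeter.greet(null);
        } catch (IllegalArgumentException expected) {
            // Trapped by the handler before reaching us.
        }
        System.out.println(handler.log);
    }
}
```

Note that the crosscutting concern (tracing) lives entirely in the
handler: neither the interface nor its implementation knows about it.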

XSP code generation can easily be used for the same purpose although, as
Stefano points out, XML syntax is very unfriendly to code (the
"language impedance" issue discussed below).

Consider XSP page debugging: the XSP code generator could well adorn
generated code to keep track of source lines in the XSP page and all
intervening logicsheets. Thus, when an exception is thrown, this
information would be used to dynamically generate an error page
displaying the appropriate context for the generated source program,
the original XSP page and all relevant logicsheets: an XSP developer's
nirvana! :-)

In fact, this wouldn't need to be the code generator's responsibility:
an otherwise "regular" logicsheet could be applied that scans all
<xsp:expr> and <xsp:logic> tags in order to append location info to the
generated source program. All in an aspect-oriented fashion.

Of course, there are many more implications to making XSP aspect-aware,
but one could say that the basic mechanisms needed to achieve this are
already in place.

If we refine the logicsheet applying mechanism and (ideally) also
disallow embedding raw code in XSP pages (i.e., we base all dynamic XML
generation on logicsheets) then capturing/representing aspects would
become much simpler. (This is even more obvious under the composition
model mentioned above.)

2) Making XSP a general-purpose factory for server components

Probably the most wasteful single mistake in the existing (Cocoon) XSP 
implementations is limiting code generation to Producers (C1) and
Generators (C2).

This undesirable "feature" stems from the fact that the last-stage
logicsheet applied (the one dealing with the XSP tagset as such) also
contains the compilation unit skeleton.

If we decouple XSP from the target source program type and the
invocation environment (e.g. http/servlet) then the XSP language could
be used for building practically any kind of markup-generating program.

To achieve this, we'd simply need to order the steps of the XSP code
generation pipeline roughly as follows:

a) Apply aspect-oriented logicsheets. This step will surround
   (aspect-oriented) dynamic tags with appropriate aspect-support logic
   (in the target programming language).
b) Apply dynamic tag substitution (non-aspect) logicsheets. This step
   will translate dynamic tags to their corresponding <xsp:logic> and
   <xsp:expr> equivalents for the target programming language.
c) Apply the compilation unit skeleton logicsheet. This step will build
   a specific program type (Generator, Transformer, other). It's here
   that we'd decouple XSP from its current server pages
   single-mindedness.
d) Apply environment-specific logicsheet(s). This step results in the
   inclusion of declarations that make the calling environment's object
   model readily available to the generated program.
e) Apply the last stage XSP tagset logicsheet. This step will produce
   the final source program.

This sketchy pipeline pattern can be extended to support different XML
API's (e.g. SAX, DOM, JDOM)
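
The ordering idea can be sketched with the standard javax.xml.transform
(TrAX) API: each logicsheet is a stylesheet whose output feeds the next.
This is only an illustration, not Cocoon's actual code generator. It
shows two stand-in stages (a tag substitution and a skeleton), and the
<now/> tag, the "Page" skeleton and the LogicsheetPipeline class are all
made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class LogicsheetPipeline {
    static final String XSL_OPEN =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>";
    static final String IDENTITY =
        "<xsl:template match='@*|node()'><xsl:copy>"
        + "<xsl:apply-templates select='@*|node()'/></xsl:copy></xsl:template>";

    // Stand-in for step b): replace a (hypothetical) dynamic tag by an expression.
    static final String TAG_SUBSTITUTION = XSL_OPEN + IDENTITY
        + "<xsl:template match='now'><expr>new java.util.Date()</expr></xsl:template>"
        + "</xsl:stylesheet>";

    // Stand-in for steps c) and e): wrap the expression in a program skeleton.
    // Swapping this stage is what would select a different program type.
    static final String SKELETON = XSL_OPEN
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:text>class Page { Object value() { return </xsl:text>"
        + "<xsl:value-of select='//expr'/>"
        + "<xsl:text>; } }</xsl:text>"
        + "</xsl:template></xsl:stylesheet>";

    // Applies the stylesheets in order, feeding each stage's output to the next.
    static String apply(String xml, List<String> stylesheets) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            String current = xml;
            for (String xslt : stylesheets) {
                Transformer stage =
                    factory.newTransformer(new StreamSource(new StringReader(xslt)));
                StringWriter out = new StringWriter();
                stage.transform(new StreamSource(new StringReader(current)),
                                new StreamResult(out));
                current = out.toString();
            }
            return current;
        } catch (TransformerException e) {
            throw new IllegalStateException("Logicsheet stage failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
            apply("<page><now/></page>", Arrays.asList(TAG_SUBSTITUTION, SKELETON)));
    }
}
```

Because the skeleton is just one more entry in the list, the same
machinery could in principle serve server pages, the sitemap or any
other program type.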

It's also based on the assumption that a single target programming
language is used. Btw, Sam Ruby has long advocated the convenience of
allowing for multiple programming language support, a feature already
present in Websphere's JSP implementation.

Note that this would allow for the reuse of logicsheets (actually,
taglibs) written in different languages in the same XSP document.

I'm still working on exploring different alternatives, but it seems that
the factors [aspects? :-)] involved here are:

- Program type (compilation unit skeleton: Generator, Transformer, etc.)
- Programming language(s) (Java, Javascript, NetRexx, etc)
- Invocation environment (http/servlet, command-line)
- XML API's (SAX, DOM, JDOM)
- Taglibs/logicsheets
- Aspects

3) Refactoring the XSP implementation to avoid redundancy

Currently, there are 2 overlapping, redundant "markup languages" in
Cocoon2: XSP for server pages and XSP for the sitemap. Allan Erskine
also mentions a possible "flowmap" which (in the current setup) would
require the creation of a third MarkupLanguage implementation.

Historically, the MarkupLanguage abstraction was meant to signify
different code generation tagsets, namely: XSP (<xsp:logic>, <xsp:expr>)
and JSP (<jsp:scriptlet>, <jsp:expr>). Admittedly, even for this case it
was unnecessary: it would be much simpler to translate JSP code
generation tags into their XSP counterparts without the need to
introduce the notion of a code-generation markup language (SOM!)

When the need for sitemap compilation was perceived and its
implementation pursued, "MarkupLanguage" was understood as what I've
called above the "compilation unit skeleton", so a new MarkupLanguage
(pirated from the server pages XSP implementation) was created:
SitemapMarkupLanguage.

In fact, this was the only possible workaround that allowed the existing
XSP code generation framework to be reused.

Had I decoupled the code generation engine from the target program type
in time, we wouldn't have a separate sitemap MarkupLanguage today.
Again: SOM! :-(

Fortunately, the solution is obvious: deprecating MarkupLanguage and
adding a separate program skeleton logicsheet step in the XSP code
generation pipeline.

This way, both the sitemap and server pages can reuse the code
generation machinery without redundancy.

This is also true for "new" program types such as Transformers and, yes,
Allan's flowmap.

4) Avoiding "language impedance" in XSP

This is a tough one: server pages developers seem to love the ability to
embed "raw" source code in markup, a trend originally set by M$'s ASP
and later embraced by JSP and XSP.

While this has the appeal of enabling quick prototyping, we have come
to suspect that _any_ language-specific programming constructs in XSP
break encapsulation and are, therefore, WRONG!

Embedding code in markup not only breaks encapsulation: it hinders
reuse!

Language impedance shows up in other ways as well. Probably the ugliest
one is the need to escape markup metacharacters (&, <, >) or the
alternative enclosing of source code in CDATA sections.

Both XSP and JSP provide a way to encapsulate the abovementioned
programming constructs: taglibs (logicsheets.)

Taglibs (and their implementing logicsheets) are much more than a nice
way to allow non-programmers to write dynamic server pages: they are
_the_ proper way to express dynamic (i.e., logic-based) content
generation in XML.

Associating taglibs with namespaces, for instance, supports very
powerful constructs: multi-dimensionality, aspect orientation...

This whole point seems to boil down to the following:

  All that can be achieved by embedding code in markup can also be
  achieved by inlining the invocation of a suitably defined method
  on a suitably defined object.

As it turns out, such "suitably defined object" doesn't need to be a
markup-aware wrapper around an application object or component.

There are many ways to transparently convert regular objects to markup
(e.g., using introspection in Java, a la Castor). In cases in which a
more "specialized" markup version of the object is required, extending
a regular class to implement XMLFragment might suffice.

In other cases in which arguments to a "suitably defined method" are
themselves markup, an object representation of such markup (like
bytecode-compiled XML) is appropriate. Note that this scenario typically
coincides with the case in which one needs to do selection or iteration
over markup. This should make it unnecessary to come up with procedural
constructs such as "<xsp:if>" or "<xsp:for-each>" that are proposed from
time to time.

All that said, embedded code can always be replaced by method
invocations on _components_.

Correspondingly, it is always possible to associate (logicsheet)
namespaces with component instance declarations so that dynamic tags
are mapped to method calls on such components. (These concepts are
closely related to the composition-oriented approach based on Avalon
patterns and mentioned above.)
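
To make the mapping concrete, here is a minimal sketch of a dispatcher
that binds a (namespace, tag) pair to a method on a component instance
and resolves the call by reflection. TagDispatcher, TextTools and the
"urn:example:text" namespace are hypothetical names chosen for this
example; a real implementation would more likely obtain its components
from a component manager.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical component; any plain object with public methods would do.
class TextTools {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

class TagDispatcher {
    // Keys have the form "{namespace-uri}tag-name".
    private final Map<String, Object> components = new HashMap<>();
    private final Map<String, String> methods = new HashMap<>();

    // Declares that a dynamic tag maps to a method on a component instance.
    void bind(String nsUri, String tag, Object component, String methodName) {
        String key = "{" + nsUri + "}" + tag;
        components.put(key, component);
        methods.put(key, methodName);
    }

    // Evaluates a dynamic tag by invoking the bound method with the given arguments.
    Object call(String nsUri, String tag, Object... args) {
        String key = "{" + nsUri + "}" + tag;
        Object component = components.get(key);
        if (component == null) {
            throw new IllegalArgumentException("Unbound tag " + key);
        }
        String name = methods.get(key);
        try {
            // Simplification: match by name and argument count only.
            for (Method m : component.getClass().getMethods()) {
                if (m.getName().equals(name) && m.getParameterCount() == args.length) {
                    return m.invoke(component, args);
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Call failed for " + key, e);
        }
        throw new IllegalArgumentException("No method " + name + " for " + key);
    }

    public static void main(String[] args) {
        TagDispatcher dispatcher = new TagDispatcher();
        dispatcher.bind("urn:example:text", "greet", new TextTools(), "greet");
        System.out.println(dispatcher.call("urn:example:text", "greet", "XSP"));
    }
}
```

The page author only ever sees the tag; the language-specific part is
confined to the component.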

Last, but not least, just as embedding code in XSP pages is
unnecessary (and undesirable), it's also true that:

  generating free-form code in logicsheets is also unnecessary (and
  undesirable).

I think that logicsheets should limit themselves to generating method
calls on component instances. I have the impression that logicsheets
should _not_ (and need not) generate arbitrary statements, much less
generate inner classes or the like.

Why?

First: it's not necessary; anything that can be achieved by building a
class from a template can be achieved by properly configuring an
existing generic component.

Second: for the same reasons that apply to embedding code in XSP pages,
anything that can be achieved by generating multiple statements can be
achieved by a method call.

Advantages:
  - Code size is dramatically decreased
  - Logicsheet complexity is dramatically decreased
  - No conflict with variable names previously defined by the same or
    other logicsheets

In conclusion, avoiding the code/markup impedance can be achieved by:

  - Outlawing embedding code in markup in XSP pages
  - Basing code generation (i.e., logicsheets) on emitting code that
    simply interacts with well-defined components

5) Defining a logicsheet language: SiLLy

   NB: Logicsheets are far simpler in AxKit (Perl). Boy, are they lucky!

XSLT-based XSP logicsheet authoring has turned out to be a true PITA
(tm).

XSLT is _incredibly_ powerful, despite the claims of those who only seem
to understand procedural transformations (cousins of the ones advocating
the infamous <xsl:script> tag which, btw, I've seen only in M$'s
implementation)

  OT: There has been a controversy about XSP vs XSLT extensions. The new
      controversy about XSLT vs XQuery promises to be enlightening in
      this regard: while nobody wants redundancy, I also wonder whether
      there's a "one size fits all" attitude. A juicy subject, indeed,
      but that deserves a separate discussion...

Powerful as XSLT is, its use in code-generation logicsheets doesn't come
without problems.

Granted, most transformational patterns characteristic of code
generation can be correctly and completely expressed in XSLT without
resorting to extension functions (which, btw, are currently available
only in Xalan).

Their implementation, though, is usually too verbose and convoluted.
To illustrate the point, remember how one must declare dynamic tag
parameters so that they can be referenced as XSLT variables in source
code templates: tedious, repetitive, error-prone.

String manipulation is also a problem area. Think of the complexity seen
in the "get-nested-string" utility template.

Indentation in generated code is also a problem. While, for most
languages, this is just an aesthetic consideration, there are other
cases in which it can become a real problem: think of [J]Python.

A higher level language is called for and we all have known it for a
while now. That's what SiLLy is about.

The simplest alternative is for SiLLy to be preprocessed to XSLT: that
way we simplify logicsheet authoring while preserving the power and
portability of XSLT.

In order to transform SiLLy into XSLT, an XSLT stylesheet can be used.
Using XSLT to generate XSLT is a possibility contemplated in the XSLT
spec itself.

This approach has been addressed before (admittedly in a very
superficial way.)

If we restricted XSP code generation to component declarations and
method invocations, SiLLy could become much simpler:

  <logicsheet ns-uri="http://plenix.org/xsp/date-format"
              interface="org.plenix.sll.DateFormatter">
    <implementation ns-prefix="date-format"
                    class="org.plenix.sll.DefaultDateFormatter"/>
    <tag name="format" method="doFormat">
      <arg name="date" type="java.util.Date"/>
      <arg name="mask" type="java.lang.String" default="hh:mm:ss"/>
    </tag>
  </logicsheet>

  <logicsheet ns-uri="http://plenix.org/xsp/system"
              interface="org.plenix.sll.SystemInfo">
    <implementation ns-prefix="system"
                    class="org.plenix.sll.DefaultSystemInfo"/>
    <tag name="current-time" method="getCurrentTime"/>
  </logicsheet>

  <page>
    <p xmlns:system="http://plenix.org/xsp/system"
       xmlns:date-format="http://plenix.org/xsp/date-format">
      Hi there!
      To the best of my knowledge, it's now
      <date-format:format mask="hh:mm">
        <date-format:date><system:current-time/></date-format:date>
      </date-format:format>
    </p>
  </page>

To be honest, the above example is a simplification of the syntax
I'm testing for the composition-based approach I've been mentioning.
This approach does not require code generation: it uses the same Avalon
component management patterns Cocoon2 itself is based on.
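
For illustration, here is roughly what the descriptors and page above
would amount to in plain Java. The interface and class names come from
the example; the method signatures are an assumption read off the <tag>
and <arg> declarations, and PageSketch is a hypothetical stand-in for
whatever evaluates the page.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Assumed shape of org.plenix.sll.DateFormatter (signature inferred from <arg> elements).
interface DateFormatter {
    String doFormat(Date date, String mask);
}

// Assumed shape of org.plenix.sll.SystemInfo.
interface SystemInfo {
    Date getCurrentTime();
}

class DefaultDateFormatter implements DateFormatter {
    @Override
    public String doFormat(Date date, String mask) {
        // "hh:mm:ss" mirrors the default declared for the mask argument.
        String pattern = (mask == null) ? "hh:mm:ss" : mask;
        return new SimpleDateFormat(pattern, Locale.ROOT).format(date);
    }
}

class DefaultSystemInfo implements SystemInfo {
    @Override
    public Date getCurrentTime() {
        return new Date();
    }
}

class PageSketch {
    // The whole page reduces to literal text plus two nested method calls:
    // <date-format:format mask="hh:mm"> wrapping <system:current-time/>.
    static String render(SystemInfo system, DateFormatter dateFormat) {
        return "Hi there! To the best of my knowledge, it's now "
                + dateFormat.doFormat(system.getCurrentTime(), "hh:mm");
    }

    public static void main(String[] args) {
        System.out.println(render(new DefaultSystemInfo(), new DefaultDateFormatter()));
    }
}
```

Since the page depends only on the two interfaces, a different
implementation class can be swapped in without touching the markup.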

Anyway, a special-purpose, "manual" transformer (in the TrAX sense) may
be appropriate if we base everything on components/methods: XSLT may no
longer be the best choice for processing logicsheets.

Un saludo tropical desde la Tierra del Olvido!

Ricardo
