cocoon-dev mailing list archives

From Daniel Fagerstrom <dani...@nada.kth.se>
Subject Re: [RT] Access to the object model in XTTL
Date Mon, 14 Apr 2003 08:21:55 GMT
Stefano Mazzocchi wrote:
 > on 4/10/03 3:46 PM Daniel Fagerstrom wrote:
<snip/>
 > I've taken a pretty serious look at XQuery and I think that it fits the
 > needs for what I wanted.

Ok, I'll take a closer look at XQuery and the implementations you pointed to.

 > Logically speaking, I agree that providing a coherent, tree-shaped and
 > read-only view of data from a template language makes perfect sense.
 >
 > Still, there are huge performance issues on the table. They are somewhat
 > related to the push vs. pull debate, which often tends to become
 > a philosophical question.

Xalan actually does pull parsing when used together with Xerces
(http://xml.apache.org/xalan-j/dtm.html#incremental). But as both push
and pull parsing are sequential in nature, and XSLT processors need a
random-access representation of the XML internally, the best idea from a
performance POV is to have a lazy DOM adapter layer between the Java data
structure and the XSLT processor. That way, DOM adapter objects are
created only for the part of the data tree that the XSLT processor
actually accesses.
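
A minimal sketch of the lazy adapter idea, assuming the object model is a
tree of nested java.util.Map objects. The class LazyNode and its methods
are hypothetical names used only for illustration; a real adapter would
implement org.w3c.dom.Node instead of this reduced interface. The counter
only serves to show that adapters are created for the accessed path and
nothing else.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lazy tree view over nested Maps (not a real DOM Node). */
public class LazyNode {
    /** Number of adapter objects created so far, to make laziness visible. */
    public static int created = 0;

    private final String name;
    private final Object value;  // a Map for inner nodes, anything else for leaves
    private final Map<String, LazyNode> cache = new HashMap<>();

    public LazyNode(String name, Object value) {
        this.name = name;
        this.value = value;
        created++;
    }

    public String getName() {
        return name;
    }

    /** Wraps the named child on first access only; returns null if absent. */
    public LazyNode child(String childName) {
        if (!(value instanceof Map)) {
            return null;
        }
        Map<?, ?> map = (Map<?, ?>) value;
        if (!map.containsKey(childName)) {
            return null;
        }
        return cache.computeIfAbsent(childName, k -> new LazyNode(k, map.get(k)));
    }

    /** Text content of a leaf; inner nodes have no text of their own. */
    public String text() {
        return value instanceof Map ? "" : String.valueOf(value);
    }
}
```

Walking om/flow/continuation/id through such a view creates four adapter
objects (the root plus one per step), however large the session subtree
next to it is.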

 >>Tree or forest ?
 >>----------------
 >>The XML view of the OM data can either be represented as one large XML
 >>tree containing all the XML views of the OM objects as subtrees, or as
 >>one XML object for each OM object (or even for parts of the OM objects). I
 >>think I prefer the forest view, as it suggests a more modular
 >>architecture for the various OM-object-to-XML adapters.
 >
 >
 > You are stating that there is a difference between a folder and a
 > document, basically. Logically speaking, I think this is just another
 > metadata on top of a node and should not be reflected by the underlying
 > syntax.

Seems reasonable; I mixed in some implementation issues.

 >
 > This is also the road taken by JSR 170 in designing the Repository API,
 > which is, in short, a huge tree with granularity down to the single text
 > node of a DOM or to a single MPEG frame of a multi-Gb video stream.
 >
 > The problem they have is that they are now so abstract they are not sure
 > (yet) on how to query it :-) (unless they provide XML-ized views even
 > for those non-xml nodes, very nasty problem)

Ok

<snip/>
 > The problem is not SAX or dom. It's much worse than this.
 >
 > Suppose you have an XQuery template with something like
 >
 >  <html xmlns:c="http://apache.org/cocoon/xquery/object-model">
 >   <body>
 >    <form action="{$c:om/flow/continuation/id}" >
 >     <input type="text" name="skin"
 > value="{$c:om/session/style//profile[name='skin']}"/>
 >    </form>
 >   </body>
 >  </html>
 >
 > from what I read in the XQuery specs, the above is legal. If not, an
 > alternative could be to use
 >
 >  <html xmlns:c="http://apache.org/cocoon/xquery/object-model">
 >   <body>
 >    <form action="{c:om()/flow/continuation/id}" >
 >     <input type="text" name="blah"
 > value="{c:om()/session/style//profile[name='skin']}"/>
 >    </form>
 >   </body>
 >  </html>
 >
 > Now: how would you implement the above? SAX or DOM?

In general I think it is better to use DOM (through a lazy DOM adapter)
if your data already is in tree form, especially if you are only going
to access a small part of it. If the data is in text form, or is only
available through an iterator like a row set from JDBC, SAX is a better
alternative. For your example this means that the continuation id should
be in a DOM tree. For the session data I don't know; I need more info
about what kind of data you are storing in the session attribute.
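
For the iterator case, a small sketch of pushing rows straight to a SAX
ContentHandler without building a tree first. The class name RowsToSax,
the element names "rows" and "row", and the use of plain strings instead
of a JDBC row set are assumptions made to keep the example self-contained.

```java
import java.util.Iterator;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: streams an iterator of rows as SAX events. */
public class RowsToSax {

    /** Emits one row element per item; nothing is buffered as a tree. */
    public static void stream(Iterator<String> rows, ContentHandler out)
            throws SAXException {
        AttributesImpl noAttrs = new AttributesImpl();
        out.startDocument();
        out.startElement("", "rows", "rows", noAttrs);
        while (rows.hasNext()) {
            char[] text = rows.next().toCharArray();
            out.startElement("", "row", "row", noAttrs);
            out.characters(text, 0, text.length);
            out.endElement("", "row", "row");
        }
        out.endElement("", "rows", "rows");
        out.endDocument();
    }

    /** Demo consumer: counts the row elements a handler receives. */
    public static int countRows(List<String> rows) {
        final int[] count = {0};
        DefaultHandler counter = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                if ("row".equals(qName)) {
                    count[0]++;
                }
            }
        };
        try {
            stream(rows.iterator(), counter);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }
}
```

Each row is handed to the consumer as soon as the iterator yields it, which
is why sequential sources fit SAX better than a random-access tree.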

<snip/>
 >>Parameters
 >>- - - - - -
 >>In Xalan and XSLTC, any type of Java object can be supplied to the XSLT
 >>processor as a named parameter. The externally supplied parameters must
 >>be declared with xsl:param statements on the global level to be used in
 >>the rest of the stylesheet, this rules out simplified stylesheets (the
 >>ones without enclosing xsl:stylesheet). Types like String, Boolean,
 >>Integer, Node, NodeIterator will be adapted to the corresponding XPath
 >>types and accessed with ordinary XPath expressions. Types that don't
 >>correspond to any XPath type can still be used by extension functions;
 >>reflection is used to find the right extension function.
 >>
 >>If we supply the object model as a parameter with the name "om", we can
 >>then access a request parameter "foo" by writing
 >>"java:org.apache.cocoon.environment.Request.getParameter(java:java.util.Map.get($om,
 >>"request"), "foo")". This requires that the parameter "om" is declared
 >>in the stylesheet and that the namespace "java" is defined. There is
 >>also a namespace mechanism in Xalan that makes it possible to use a
 >>short namespace identifier instead of "package.Class".
 >
 >
 > I don't get this, can you elaborate more?

I basically tried to give a terse summary of how Xalan's extension
mechanism is used; it is probably better to refer to the documentation:
http://xml.apache.org/xalan-j/extensions.html.
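
As a small illustration of the parameter part only, here is a sketch that
uses nothing but the standard javax.xml.transform API and a String value.
The class name ParamDemo and the parameter name "foo" are made up for the
example. Passing the whole object model and calling Java methods through
the "java" namespace depends on the processor's extension support and is
not shown here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical demo: supplies a named parameter to a stylesheet. */
public class ParamDemo {

    // The parameter must be declared with a top-level xsl:param to be usable.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:param name='foo'/>"
      + "<xsl:template match='/'><xsl:value-of select='$foo'/></xsl:template>"
      + "</xsl:stylesheet>";

    /** Runs the stylesheet on a dummy document with $foo set to fooValue. */
    public static String render(String fooValue) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            transformer.setParameter("foo", fooValue);  // a String becomes an XPath string
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader("<doc/>")),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```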


 >>Cashing
 >>--------
 >
 >
 > you mean: where you get the money? ;-) Sorry, couldn't resist.
 >
:)

 >
 >>I think there are two main ways to find out what data the cashing of
 >>XTTL should be based on:
 >>One could explicitly list what data the cashing should depend on, e.g.
 >>in the sitemap, or the cashing keys could be inferred from the XTTL
 >>document. It might also be a good idea to be able to turn cashing on and
 >>off for the XTTL generator, as it might be worthwhile to do a fairly
 >>complicated validity calculation for a page that is heavy to generate,
 >>but not for a page that is cheap to regenerate (maybe there are some
 >>general mechanisms in the sitemap that already do that?).
 >
 >
 > XSP contains all this already.
 >
 > As a sidenote I have been thinking that making a non-xml syntax for XSP
 > might even be better because it was designed for generation and it does
 > have a bunch of machinery in place already.

If you only want to do the kind of access that XSP already does, this
seems to be a much better idea. I prefer to view all OM data as XML, and
from that POV I believe that XSLT or XQuery are better tools.

<snip/>
 > Caching depends on the caching logic implemented by the various modules.
 > in order for an XSLT stylesheet to be cacheable, it should implement the
 > proper hooks so that the pipeline can call it.
 >
 > this sounds rather unfeasible to me.
 >
 > [snip]

Might be. What would be needed is an API for asking the XSLT processor
which URIs are used in include and import statements, to know when to
recompile the stylesheet (this is already done in Excalibur's XSLT
processor wrapper, in a more indirect way, as I have explained before).
One also needs a way to ask which URIs are used as arguments to the
document function, to be able to know when the cached output from the
XSLT processor is invalid. Such things might be interesting for other
projects that use Xalan.
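
One indirect way to collect such URIs with the standard API is a
javax.xml.transform.URIResolver that records every href it is asked for.
The sketch below is not the Excalibur implementation; the class name
RecordingResolver and the in-memory map of stylesheets are assumptions
made so the example runs on its own.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical resolver that remembers which hrefs were requested. */
public class RecordingResolver implements URIResolver {
    private final Map<String, String> documents;  // href -> XML text (in-memory stand-in)
    private final List<String> seen = new ArrayList<>();

    public RecordingResolver(Map<String, String> documents) {
        this.documents = documents;
    }

    @Override
    public Source resolve(String href, String base) {
        seen.add(href);
        String xml = documents.get(href);
        // Returning null lets the processor try to resolve the URI itself.
        return xml == null ? null : new StreamSource(new StringReader(xml), href);
    }

    public List<String> seen() {
        return seen;
    }

    /** Compiles mainXsl and returns the hrefs requested during compilation. */
    public static List<String> includesOf(String mainXsl, Map<String, String> others) {
        RecordingResolver resolver = new RecordingResolver(others);
        TransformerFactory factory = TransformerFactory.newInstance();
        factory.setURIResolver(resolver);  // consulted for xsl:include and xsl:import
        try {
            factory.newTemplates(new StreamSource(new StringReader(mainXsl)));
        } catch (TransformerConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return resolver.seen();
    }
}
```

The recorded list could serve as the set of dependencies to check before
reusing a compiled stylesheet. Setting the same kind of resolver on the
Transformer (Transformer.setURIResolver) would record the URIs passed to
document() at transformation time.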

<snip/>

 >>* If the document function can supply the XTTL generator with cashing
 >>info I believe that using the document function for OM input is the most
 >>attractive way.
 >
 >
 > I'm pretty positive this is not fully possible: the document() function
 > expects sources, which are not cacheable with Cocoon's highly abstract
 > strategy. They somewhat expect streams and estimate their ergodic period
 > using last-modification time. Cocoon is much more abstract than this
 > and it's not always possible to project it onto a last-modified time.
 >
 > But this is a rather theoretical point. Still, those templates will
 > probably never be cached if they use document() somewhere, but this might
 > not necessarily be a bad thing since most XSPs today don't implement
 > cacheable anyway and nobody cries for Cocoon performance.

Yes, I took a look at some of the logicsheets and saw that. Maybe it is
better to focus on making the access fast and drop the cacheability
questions for now.

<snip/>

 > Daniel, while I do like this thread, I think there are much more
 > important issues to deal with right now. So, please, hold on any
 > high-research RT until we release or my lack of time will force me to
 > ignore your messages and a bunch of nice thinking could go wasted.
 >
 > TIA

No problem, I completely agree that a 2.1 beta is much more important
than a new template language. We can continue the discussion after the
release.



