cocoon-dev mailing list archives

From "Hunsberger, Peter" <Peter.Hunsber...@stjude.org>
Subject RE: [ANN] XInclude processor for xml-commons
Date Tue, 06 May 2003 14:51:42 GMT
Stefano Mazzocchi <stefano@apache.org> writes:

<big snip/>
> 
> > But I'll leave the hard job of selling XNI to Andy; it works well for
> > us, but perhaps SAX is good enough for what you need it to do.
> 
> Yes. I can only speak for myself (and I encourage others to 
> speak up if I'm mistaken) but I never felt the need for 
> something better. (I do have some issues with the fact that 
> SAX is lossy in respect of the original whitespace between 
> attributes and attribute order, but that's not an issue for cocoon)
> 
> > Cocoon being
> > the huge project it is, I certainly wouldn't blame you for needing 
> > some very solid reasons to migrate to a different pipeline framework!
> 
> Yes. We would need *incredibly* solid arguments to do such a 
> transition without risking a huge fork that would kill us. 
> This is why I asked: in all honesty, it's much easier for us 
> to turn off parser validation entirely and provide a Jing SAX 
> filter than moving everything to XNI to make internal Xerces 
> filters into cocoon pipeline components.
> 
> And if the need ever emerged, we could use the XNI/SAX 
> adapters that already exist, like we do for Andy's HTML parser.

Stefano,

I think it's possible we're going to end up having to re-evaluate SAX
someday in any case.  Obviously (? ;-) such a decision would be a Cocoon 3
type of thing, but this gets back to the push/pull lazy evaluation
discussion we've touched on in the fairly recent past:  if you really wanted
to have a common XML database in the guts of Cocoon then I don't think SAX
would suffice as an API.  SAX is push-only, so pull semantics don't work
over it; you need something that supports a DOM-like traversal model (or
better, an abstract XQuery type of traversal?).

I suspect wrappers/adapters won't work, since I think the XML database
management mechanism is going to need access to information that won't make
it through such a model.  More importantly, if you have such a thing, much
of the way the end user thinks can change: document and URI resolution in
general is no longer any more expensive than any other access to the data
and things like sitemap aggregation can go away (don't have to, just can) in
favor of more standard XSLT type data access.  In such a case, DOM-like
assumptions become the norm on the pull side and SAX-like reactive patterns
only make sense on the push side.
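
To make the push/pull split concrete, here is a rough sketch.  It is Python
standard library only (xml.sax and xml.dom.minidom), not Cocoon's Java
pipeline API, and the sample document, the PageTitles handler and the two
helper functions are all made up for illustration.  On the push side the
parser drives and the handler can only react to events as they arrive; on
the pull side the consumer asks the tree for exactly the node it wants.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, used only for this illustration.
DOC = "<site><page id='a'>Alpha</page><page id='b'>Beta</page></site>"


class PageTitles(xml.sax.ContentHandler):
    """Push side: the parser calls us; we react to events in document order."""

    def __init__(self):
        super().__init__()
        self.titles = {}
        self._current = None  # id of the <page> we are inside, if any

    def startElement(self, name, attrs):
        if name == "page":
            self._current = attrs["id"]
            self.titles[self._current] = ""

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._current is not None:
            self.titles[self._current] += content

    def endElement(self, name):
        if name == "page":
            self._current = None


def push_titles(text):
    """Stream the whole document through the handler and collect every title."""
    handler = PageTitles()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.titles


def pull_title(text, page_id):
    """Pull side: build a tree, then fetch only the node the caller asks for."""
    dom = minidom.parseString(text)
    for page in dom.getElementsByTagName("page"):
        if page.getAttribute("id") == page_id:
            return page.firstChild.data
    return None
```

The push version has to see every event to answer any question, whereas
the pull version answers one targeted question against a model that is
already there, which is roughly the situation a shared XML database in the
guts of Cocoon would put you in.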

The other possibility is that a new processing model such as STX makes such
an argument irrelevant, but I don't think that will happen: among other
things, STX is an optimization to solve some of the problems of the DOM-type
assumptions made by XSLT.  As such it attacks only one axis of the
space/time optimization problem and doesn't give you a way to solve issues
such as cache normalization.  I could see an approach that says we'll just
optimize for memory usage -- processing will take care of itself -- but that
seems to make little sense now that we're about to see 64-bit address spaces
become commodity type capabilities...
