cocoon-dev mailing list archives

From Stefano Mazzocchi <>
Subject Re: [ANN] XInclude processor for xml-commons
Date Tue, 06 May 2003 03:28:07 GMT
on 5/5/03 5:38 PM Neil Graham wrote:

> Hi Stefano,

Hi Neil,

thanks very much for answering this.

> Sorry about the delay; I was really hoping that Andy Clark (the "Father of
> XNI") would pick up this thread.  He'd do a much better job of contrasting
> XNI with SAX, and especially giving you the historical angle (I came on the
> scene a bit after XNI had gotten out of the crib, so by then the decision
> to break from SAX was already made).


>>Can you tell us why SAX is not powerful enough for what you need?

> Well it's turned out to be pretty handy to be able to pass "augmentations"
> along with the events representing the infoset. 

Can you give an explicit example of this? That would help us understand.

> SAX doesn't have any kind
> of configuration-management infrastructure:  If you want to set up a
> pipeline of XMLFilters then you have to manage propagation of features and
> properties to the components that comprise the pipeline on your own; 

Hmmm, I see, but why do you pass such information through the pipeline
rather than as a pipeline context?
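
To make the point about propagation concrete, here is a minimal sketch of a plain SAX filter chain built on `XMLFilterImpl`. The two filters and the property URI `http://example.org/sketch/prefix` are hypothetical, invented only for illustration. The sketch assumes the stock JAXP parser. `XMLFilterImpl` forwards `setFeature` and `setProperty` to its parent, so any filter with its own settings has to intercept the names it understands and pass the rest along by hand.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class FilterChainSketch {
    // Hypothetical property name, invented for this sketch.
    static final String PREFIX_PROPERTY = "http://example.org/sketch/prefix";

    /** Hypothetical filter with its own setting: prepends a prefix to element names. */
    static class PrefixFilter extends XMLFilterImpl {
        private String prefix = "";

        PrefixFilter(XMLReader parent) { super(parent); }

        @Override
        public void setProperty(String name, Object value)
                throws SAXNotRecognizedException, SAXNotSupportedException {
            if (PREFIX_PROPERTY.equals(name)) {
                prefix = String.valueOf(value);   // consume our own property
            } else {
                super.setProperty(name, value);   // forward everything else upstream
            }
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            super.startElement(uri, prefix + local, prefix + qName, atts);
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            super.endElement(uri, prefix + local, prefix + qName);
        }
    }

    /** Hypothetical filter without settings: upper-cases element names. */
    static class UpperCaseFilter extends XMLFilterImpl {
        UpperCaseFilter(XMLReader parent) { super(parent); }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            super.startElement(uri, local.toUpperCase(Locale.ROOT),
                    qName.toUpperCase(Locale.ROOT), atts);
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            super.endElement(uri, local.toUpperCase(Locale.ROOT), qName.toUpperCase(Locale.ROOT));
        }
    }

    /** Runs parser -> PrefixFilter -> UpperCaseFilter and returns the element names seen. */
    static String run(String xml, String prefix) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader parser = factory.newSAXParser().getXMLReader();
        XMLReader pipeline = new UpperCaseFilter(new PrefixFilter(parser));

        // Both calls enter at the outermost filter and travel upstream one parent at a time.
        pipeline.setFeature("http://xml.org/sax/features/namespaces", true); // reaches the parser
        pipeline.setProperty(PREFIX_PROPERTY, prefix);                       // stops at PrefixFilter

        List<String> names = new ArrayList<>();
        pipeline.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                names.add(local);
            }
        });
        pipeline.parse(new InputSource(new StringReader(xml)));
        return String.join(",", names);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<doc><item/><item/></doc>", "x-"));
    }
}
```

Nothing in SAX itself tells the caller which component ends up handling a given name; that knowledge lives in each filter's overrides, which is the bookkeeping a shared pipeline context would centralise.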

> sorts this out.  SAX is also somewhat impoverished for people who care a
> great deal about the lexical layout of DTD's; there isn't much you can't
> get in this respect from XNI.  Although SAX 1.1 extensions (still in
> alpha!) could rectify this, as it stands it's not possible to determine the
> version or encoding of a document with "stable" SAX interfaces.

I see. But the very fact that I, for one, never noticed those
limitations might seem to argue against such a move.

>>It does. But Cocoon is *entirely* built around SAX. Moving to XNI is an
>>incredibly difficult task. Nobody will do that just for the sake of it.

> I can understand that.

>>And since Cocoon almost never validates and already has XInclude
>>transformers built around SAX, I really don't see a need for such a
>>massive transition.

> I think my question revolved around Joerg's assertion that the spec thinks
> of XInclude processing being done before validation.  

I'm a follower of the JClark-ish school of thought that validation
should be orthogonal with respect to the infoset. I've heard that this
school of thought is penetrating the validation circles at W3C, which is,
IMO, a good thing, even if it moves the problem onto a pipeline definition
language; and the XPipe note, well, it cries out for abuse.

So, for now, what I personally advocate is to avoid the use of
infoset-messing validation as much as possible.

> If you don't do
> validation of any kind anyway, then clearly this won't disturb you; but if
> you did--or wanted to follow as closely to the spec as possible--then you'd
> be confronted by the fact that most SAX parsers have validation built in,
> not built as a separate module that you can drop an XInclude processor in
> front of.

Yes, that is the reason why I advocate not validating at all with
infoset-messing schemas. In fact, for schema validation needs, I
advocate the use of a RelaxNG SAX filter; together with the cocoon
sitemap syntax, it lets you describe precisely how your pipeline behaves
(and you don't have those nasty pre/post-schema-infoset issues).

>>Now, please, help us understand: what are the differences between XNI
>>and SAX and what would we gain basing the entire cocoon architecture
>>around XNI events instead of SAX events?

> At this stage, I'm afraid the biggest pro might be that SAX just doesn't
> seem to be all that healthy these days; 

While I don't question your assertion, I think it might be misleading:
there are APIs that became stable simply because there was no need to
improve them. From a parser-writer POV SAX might seem dead, while from a
non-validating SAX user POV (as a cocoon developer is), SAX is just complete.

But I do see your point.

> I don't know if you lurk on its
> development lists, but they've been pretty much dead for quite some time...

No, I don't, nor ever felt the need to, to be honest.

> But I'll leave the hard job of selling XNI to Andy; it works well for us,
> but perhaps SAX is good enough for what you need it to do.  

Yes. I can only speak for myself (and I encourage others to speak up if
I'm mistaken) but I never felt the need for something better. (I do have
some issues with the fact that SAX is lossy with respect to the original
whitespace between attributes and attribute order, but that's not an
issue for cocoon.)
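
A small sketch of the lossiness mentioned above, assuming the stock JAXP SAX parser: two start tags that differ only in the whitespace and quoting around their attributes produce the same event trace, so the original layout cannot be recovered from the events. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class LexicalLossSketch {
    /** Returns a textual trace of the startElement events reported for the given document. */
    static String trace(String xml) throws Exception {
        StringBuilder out = new StringBuilder();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                out.append('<').append(qName);
                // Only names and values are available; spacing and quote style are gone.
                for (int i = 0; i < atts.getLength(); i++) {
                    out.append(' ').append(atts.getQName(i)).append('=').append(atts.getValue(i));
                }
                out.append('>');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String compact = trace("<a x=\"1\" y=\"2\"/>");
        String spaced = trace("<a    x='1'\n     y = \"2\" />");
        System.out.println(compact.equals(spaced)); // the two layouts are indistinguishable
    }
}
```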

> Cocoon being
> the huge project it is, I certainly wouldn't blame you for needing some
> very solid reasons to migrate to a different pipeline framework!

Yes. We would need *incredibly* solid arguments to make such a transition
without risking a huge fork that would kill us. This is why I asked: in
all honesty, it's much easier for us to turn off parser validation
entirely and provide a Jing SAX filter than to move everything to XNI just
to turn internal Xerces filters into cocoon pipeline components.

And if the need ever emerged, we could use the XNI/SAX adapters that
already exist, like we do for Andy's HTML parser.

Anyway, thanks for answering. There hasn't been much communication
between the cocoon and xerces communities, but both are well founded on
xml-event-driven pipelines, even if seen from different points of view,
so any exchange of ideas, suggestions or criticism can only be a good
thing for both.


