Subject: Re: [ANN] XInclude processor for xml-commons
To: nicolaken@apache.org
Cc: cocoon-dev@xml.apache.org, commons-dev@xml.apache.org, xerces-j-dev@xml.apache.org
From: "Neil Graham"
Date: Mon, 5 May 2003 18:46:25 -0400

Hi Nicola Ken,

Some of this I covered in the note I just wrote to Stefano; but a couple
of brief comments:

> ...which brings us to the real catch22 problem, as Xerces needs XNI but
> all systems out use SAX. Why not only extend SAX? (ignorant mode)

Well, SAX is its own community and will evolve (or not...) as it will. We
needed a new kind of entity resolver, for instance; we could have extended
SAX's, but then SAX decided to define EntityResolver2. This would have
been a pretty ugly mess as soon as SAX 1.1 extensions become finalized!
Besides, with things like the introduction of Augmentation parameters
everywhere in XNI, I doubt we share many method signatures with SAX
anymore.

> Is it impossible to do validation of a document from a SAX stream? Or is
> XNI compulsory?

Well, it's compulsory for a component that lives in Xerces. I'm not
entirely clear how you'd build a DTD grammar out of the information you
get from SAX, but I've not tried to do this so it well might be possible.
But of course it would be easy (well, as easy as it ever is...) to build a
representation for an XML Schema grammar from a SAX stream.

> Is it possible to do XInclude from a SAX stream? Or is XNI compulsory?

Well, if you want to go by the letter of the spec, which Joerg tells us
assumes validation to happen after XInclude processing, then it's only
possible if you find a validator that takes SAX events as input. You
could certainly build such a thing, but I'm not sure many exist ATM.

Anyway, before responding any further on these threads, I need to spend
some quality time with XInclude, and hopefully also have a look at Joerg's
code.

Cheers!
Neil

Neil Graham
XML Parser Development
IBM Toronto Lab
Phone: 905-413-3519, T/L 969-3519
E-mail: neilg@ca.ibm.com

From: Nicola Ken Barozzi
Date: 04/30/2003 06:21 AM
Please respond to nicolaken
cc: cocoon-dev@xml.apache.org, commons-dev@xml.apache.org, xerces-j-dev@xml.apache.org
Subject: Re: [ANN] XInclude processor for xml-commons

Neil Graham wrote, On 30/04/2003 1.41:
> Hi Nicola Ken,
>
>>>In your original post, you'd mentioned that you'd implemented this as a
>>>"SAX filter"; did you mean that it's an XMLFilter implementation? If so,
>>>and you're using a standard XMLReader implementation of some sort, then
>>>how do you do XInclude processing before validation?
>
>>Ooooh, opening a can of worms ;-)
>
> Both commons-dev and xerces-j-dev have been really quiet lately; about time
> for some activity! :)

:-D

>>Yes having used XInclude I can solemnly state that XInclude has to occur
>>before validation. The question is: should a parser validate? ;-)
>
>>The fact is that Processors like Cocoon will always have other methods
>>to use before a validation, so what we (Cocoon) need is instead a
>>validating transformer, or something like that.
>
>>Not that I'm defining solutions, just that Xerces is monolithic seen
>>from a SAX-adhering processor world.
>
>
> I see what you're saying, I think: you like the idea of a componentized
> parser that operates in a pipeline, but you wish that SAX had been used as
> the glue for such a beast, rather than its own API. Right?
>
> And if SAX had seemed rich and complete enough, and had it supported the
> kind of configuration-management facilities that are needed for such an
> arrangement, I imagine that's the road we would have gone down. But it
> isn't/doesn't, so what was there to do? :)

Yeah, I know... I read about XNI, so I'm aware of the reasons it was
created...

> So I guess I'd turn the question around: If using SAX means you're stuck
> with monolithic-looking parsers, but XNI would give you a parser with all
> kinds of flexibility, maybe using XNI might merit consideration? :) Sure
> it would bind you to a specific parser (until other parsers begin to use
> XNI :)) ), but if you write your own SAX-glued parser then you're stuck to
> that anyway...

...which brings us to the real catch22 problem, as Xerces needs XNI but
all systems out use SAX. Why not only extend SAX? (ignorant mode)

>>Dumb proposal: why not a XNI2SAX Filter that uses an XNI processor for
>>SAX?
>
> We already have one: org.apache.xerces.parsers.SAXParser; heck, it even
> understands both SAX1 and SAX2! :) The trouble is that, once you're
> emitting SAX, you're not in the Xerces pipeline world anymore; so this guy
> is only useful at the end of such a pipeline (whatever XNI components that
> pipeline has in it). So there's no way of putting a Xerces validator after
> that component, for instance (to do so would kind of defeat the point of
> XNI). Does that explain anything?

Is it impossible to do validation of a document from a SAX stream? Or is
XNI compulsory?

Is it possible to do XInclude from a SAX stream? Or is XNI compulsory?

Is there a real possibility of hacking the SAX pipelines to make this
happen for *some* components like validation or XInclude?

> Here's hoping I haven't let too many more worms out of that can...

No, you have been very on-the-spot :-)

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
            - verba volant, scripta manent -
   (discussions get forgotten, just code remains)
---------------------------------------------------------------------