Delivered-To: apmail-xml-general-archive@xml.apache.org
Received: (qmail 89402 invoked by uid 500); 27 Aug 2002 14:42:02 -0000
Mailing-List: contact general-help@xml.apache.org; run by ezmlm
Precedence: bulk
Reply-To: general@xml.apache.org
Delivered-To: mailing list general@xml.apache.org
Received: (qmail 89389 invoked from network); 27 Aug 2002 14:42:02 -0000
Message-ID: <3D6B8FC2.21D4CBB0@cs.indiana.edu>
Date: Tue, 27 Aug 2002 10:42:10 -0400
From: Aleksander Slominski
X-Mailer: Mozilla 4.79 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: general@xml.apache.org
Subject: Re: Progressive parsing
References: <881654EC-B92B-11D6-9FD0-0003934D43BA@activemath.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N

Paul Libbrecht wrote:

> Here's an example:
> blop blip
>
> To reparse only the content of b of id "b1" I can then feed to the
> parser:
> blip
> thus avoiding the presumably enormous first b element's content.
> (note, this doesn't say what the parsing is actually feeding; I am
> thinking of JDOM but one's free, just... sax events).
> I see at least two applications of this:
>
> - an xml source editor that has, say, a tree-view, could reparse much
> less, thereby being much more responsive (try jEdit's excellent xml-mode,
> the parsing step is heavy!).
>
> - to make a poor-man's (read-only) database of xml-content, it would be
> sufficient to build an index of the elements with an id, which would then
> be fed in response to a query
>
> But is this good xml practice ?
> I am clearly losing the ability to apply full-validation (that is, I
> could only revalidate the element's content, is schema exchangeable in
> terms of root element like a DTD is ? relax-ng schemas ?)

hi,

however, that may not work if the parser is validating and, for example,
there are explicit rules for the children of an element (such as that it
must have two children).

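The fragment-reparsing idea discussed above can be sketched with the standard
JAXP DOM API. This is only a minimal sketch under stated assumptions: the
sample document, the class name FragmentReparse and the helper parseFragment
are hypothetical, the character offsets of the element are found with indexOf
as a stand-in for a real index of ids, and the parser is non-validating.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of Xerces or JDOM.
final class FragmentReparse {

    // Parses only text[start, end) as if it were a standalone document.
    // Assumes the slice is exactly one well-formed element.
    static Element parseFragment(String text, int start, int end) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false); // rules about the parent cannot be checked here
            InputSource source = new InputSource(new StringReader(text.substring(start, end)));
            return factory.newDocumentBuilder().parse(source).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("fragment is not well-formed on its own", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample document; the first b could be arbitrarily large.
        String doc = "<a><b id=\"b0\">blop</b><b id=\"b1\">blip</b></a>";

        // Stand-in for an index lookup: character offsets of the b1 element.
        int start = doc.indexOf("<b id=\"b1\"");
        int end = doc.indexOf("</b>", start) + "</b>".length();

        Element b1 = parseFragment(doc, start, end);
        System.out.println(b1.getAttribute("id") + ": " + b1.getTextContent());
    }
}
```

Because the slice is parsed as its own document, anything that depends on
context outside the element (content rules on the parent, entities declared in
a DTD) is not visible to the parser, which is the limitation noted above.
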
> Finally... to xerces makers/users: how do I get the byte position of an
> element declaration I've just been handed by the sax parser ?

this is more complex, as the parser works on UTF-16 characters (char), so
obtaining the position in the original stream, if it was not UTF-16, is very
difficult. however, i think that for your cases it is enough to get the
position of the start/end of an element in the character stream.

the ability to obtain a position is not currently part of xerces2, but you
can take a look at my patch, which adds to XMLLocator a function
getCurrentEntityAbsoluteOffset() that can be used to get the current position
of the parser. together with changes to XMLDocumentFragmentScannerImpl, it is
possible to get the start/end position of every XML event in XNI.

for details see:
http://www.extreme.indiana.edu/xgws/xsoap/xpp/download/PullParser2/lib/xerces2_patched/

thanks,

alek

---------------------------------------------------------------------
In case of troubles, e-mail:     webmaster@xml.apache.org
To unsubscribe, e-mail:          general-unsubscribe@xml.apache.org
For additional commands, e-mail: general-help@xml.apache.org
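
On the character-versus-byte position point in the reply above: a small sketch
of why the two differ once the stream is not UTF-16, and of one way to map a
character offset back to a byte offset when the decoded text is available. It
does not use the patched XMLLocator. The class OffsetConvert, the helper
byteOffset and the sample document are hypothetical, and the sketch assumes
the whole decoded text is in memory, the stream has no byte order mark, and
the offset does not split a surrogate pair.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Xerces.
final class OffsetConvert {

    // Number of bytes the first charOffset chars occupy when encoded with cs.
    // That is the byte offset of the char at charOffset in the encoded stream,
    // assuming no byte order mark precedes the text.
    static int byteOffset(String text, int charOffset, Charset cs) {
        return text.substring(0, charOffset).getBytes(cs).length;
    }

    public static void main(String[] args) {
        // Hypothetical document with one non-ASCII char (e-acute) before the target.
        String doc = "<a><b>\u00e9</b><b id=\"b1\">blip</b></a>";
        int charPos = doc.indexOf("<b id");

        int utf8Pos = byteOffset(doc, charPos, StandardCharsets.UTF_8);
        int latin1Pos = byteOffset(doc, charPos, StandardCharsets.ISO_8859_1);

        // In UTF-8 the e-acute takes two bytes, so the byte offset is one larger.
        System.out.println("char offset:  " + charPos);
        System.out.println("UTF-8 bytes:  " + utf8Pos);
        System.out.println("Latin-1 bytes: " + latin1Pos);
    }
}
```

Re-encoding the prefix costs time proportional to the offset, so an index
built this way would more likely record byte offsets once, while the document
is first read, than convert on every lookup.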