xml-general mailing list archives

From Aleksander Slominski <as...@cs.indiana.edu>
Subject Re: Progressive parsing
Date Tue, 27 Aug 2002 14:42:10 GMT
Paul Libbrecht wrote:

> Here's an example:
> <a>    <b><c>blop</c></b>     <b id="b1"><c>blip</c></b>    </a>
> To reparse only the content of b of id "b1" I can then feed to the
> parser:
>                         <a><b id="b1"><c>blip</c></b></a>
> thus avoiding the presumably enormous first b element's content.
> (note, this doesn't mention what the parsing is actually, feeding, I am
> thinking of JDOM but one's free, just... sax events).

> I see at least two applications of this:
> - an xml source editor that has, say, a tree-view, could reparse much
> less thereby being much more responsive (try jEdit's excellent xml-mode,
> the parsing step is heavy!).
> - to make poor-man's (read-only) database of xml-content, it would be
> sufficient to build an index of the elements with an id which would then
> be fed responding to a query
> But is this good xml practice ?
> I am clearly losing the ability to apply full-validation (that is, I
> could only revalidate the element's content, is schema exchangeable in
> terms of root element like a DTD is ? relax-ng schemas ?)


However, that may not work if the parser is validating and, for example,
there are explicit rules for the children of <a> (such as: <a> must have
two <b> children).
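
As an illustration of the reparse idea under discussion, here is a minimal
sketch in Java that reparses only one element, wrapped in a synthetic
ancestor, using the standard JAXP DOM API with validation switched off. The
class name FragmentReparse, the method reparse, and the assumption that the
character offsets of the element are already known (for example from an
index built on an earlier full parse) are all hypothetical and not part of
xerces or JDOM.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class FragmentReparse {

    /**
     * Reparses only the characters source[start, end), wrapped in a
     * synthetic ancestor element so the parser sees a well-formed document.
     * The offsets are character offsets and are assumed to be known already.
     */
    public static Document reparse(String source, int start, int end,
                                   String ancestorName) throws Exception {
        String wrapped = "<" + ancestorName + ">"
                + source.substring(start, end)
                + "</" + ancestorName + ">";

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Non-validating on purpose: content rules for the ancestor
        // (e.g. "a must have two b children") would reject the fragment.
        factory.setValidating(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(wrapped)));
    }

    public static void main(String[] args) throws Exception {
        String source =
            "<a>    <b><c>blop</c></b>     <b id=\"b1\"><c>blip</c></b></a>";
        // Offsets located by string search here only to keep the demo short.
        int start = source.indexOf("<b id=\"b1\">");
        int end = source.lastIndexOf("</b>") + "</b>".length();

        Document doc = reparse(source, start, end, "a");
        // Prints the text content of the reparsed b element.
        System.out.println(
            doc.getDocumentElement().getFirstChild().getTextContent());
    }
}
```

With a validating parser and a content model that constrains the children
of <a>, the wrapped fragment would be rejected, which is the limitation
described above.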

> Finally... to xerces makers/users: how do I get the byte position of an
> element declaration I've just been handed to by the sax parser ?

This is more complex because the parser works on UTF-16 characters (char),
so obtaining a position in the original stream is very difficult if that
stream was not UTF-16. However, I think that for your cases it is enough
to get the position of the start/end of an element in the character
stream. The ability to obtain a position is not currently part of xerces2,
but you can take a look at my patch, which adds to XMLLocator the function
getCurrentEntityAbsoluteOffset() that can be used to get the current
position of the parser. Together with changes to
XMLDocumentFragmentScannerImpl, it is possible to get the start/end
position of every XML event in XNI. For details see:




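The gap between character positions and byte positions can be sketched as
follows. Assuming the decoded text and the original encoding are both
known, a byte offset can be recovered by re-encoding the prefix up to the
character offset. The class OffsetMapping and the method byteOffset are
hypothetical names used only for illustration; the sketch ignores byte
order marks and any line-ending normalization done by the parser, and it
has nothing to do with the patch mentioned above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class OffsetMapping {

    /**
     * Returns the byte offset in the encoded stream that corresponds to a
     * character offset in the decoded text. Assumes no byte order mark and
     * that charOffset does not split a surrogate pair.
     */
    public static long byteOffset(String decoded, int charOffset,
                                  Charset encoding) {
        // Re-encode everything before the offset and count the bytes.
        return decoded.substring(0, charOffset).getBytes(encoding).length;
    }

    public static void main(String[] args) {
        // \u00e9 (e with acute accent) is 1 char, 2 bytes in UTF-8,
        // and 1 byte in ISO-8859-1.
        String text = "<a><b>\u00e9</b><b id=\"b1\"/></a>";
        int charOffset = text.indexOf("<b id");

        System.out.println("char offset:       " + charOffset);
        System.out.println("UTF-8 byte offset: "
                + byteOffset(text, charOffset, StandardCharsets.UTF_8));
        System.out.println("Latin-1 offset:    "
                + byteOffset(text, charOffset, StandardCharsets.ISO_8859_1));
    }
}
```

For this input the character offset and the UTF-8 byte offset differ by
one, because the single non-ASCII character before the element takes two
bytes in UTF-8. Re-encoding the prefix is linear in the offset, so for a
large index it would be cheaper to track byte counts incrementally.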
