xml-general mailing list archives

From "Scott Boag/CAM/Lotus" <Scott_B...@lotus.com>
Subject Re: Using own HTML parser with XSLT
Date Mon, 03 Jul 2000 02:39:27 GMT

> The idea is that the HTML file -> XSLT -> own processing -> XML file
> be fast and not require a DOM tree to be built.

At some point a tree needs to be built in all currently available XSLT
processors.  For a tree not to be built, you would have to use a subset of
XSLT that can be "streamed", and the processor would have to analyze the
stylesheet beforehand to make sure it wouldn't need to keep parts of the
tree around.  Xalan currently doesn't do this (nor do any processors that
I am aware of, though I could be wrong).  What can be done is an
incremental parse/transform, so that the parse is done in fairly small
blocks and the transform occurs, when possible, on the partially built
tree.  Xalan 2.0 will be very good at this, taking SAX events as input for
the source tree, while Xalan 1.0 only does it with the DTM, which doesn't
fit your bill.
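
As a rough illustration of the "SAX events in" approach, here is a minimal
sketch that pushes SAX events straight into an XSLT transform. It assumes
the standard JAXP javax.xml.transform.sax API (SAXTransformerFactory,
TransformerHandler) rather than the Xalan 1.0 API discussed in this thread,
and the class name, stylesheet, and element names are hypothetical. A custom
HTML parser would make the same ContentHandler calls that are written out by
hand here. Note that the processor may still build its own internal tree
from these events, as described above; the sketch only avoids building a
separate DOM first.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.helpers.AttributesImpl;

class SaxToXslt {
    // Hypothetical stylesheet: turns each html/body/p into a <para> inside <doc>.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/html'>"
      + "<doc><xsl:apply-templates select='body/p'/></doc>"
      + "</xsl:template>"
      + "<xsl:template match='p'>"
      + "<para><xsl:value-of select='.'/></para>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String transform(String... paragraphs) throws Exception {
        TransformerFactory tf = TransformerFactory.newInstance();
        // Not every factory supports SAX input; check the feature before casting.
        if (!tf.getFeature(SAXTransformerFactory.FEATURE)) {
            throw new IllegalStateException("SAX input not supported by this factory");
        }
        SAXTransformerFactory factory = (SAXTransformerFactory) tf;

        // Compile the stylesheet once; a Templates object can be reused.
        Templates templates =
            factory.newTemplates(new StreamSource(new StringReader(XSL)));

        // The handler is a SAX ContentHandler that feeds the transform.
        TransformerHandler handler = factory.newTransformerHandler(templates);
        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        // Stand-in for the events a custom HTML parser would emit.
        AttributesImpl noAttrs = new AttributesImpl();
        handler.startDocument();
        handler.startElement("", "html", "html", noAttrs);
        handler.startElement("", "body", "body", noAttrs);
        for (String text : paragraphs) {
            handler.startElement("", "p", "p", noAttrs);
            handler.characters(text.toCharArray(), 0, text.length());
            handler.endElement("", "p", "p");
        }
        handler.endElement("", "body", "body");
        handler.endElement("", "html", "html");
        handler.endDocument();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform("first", "second"));
    }
}
```

To hand the transformed output to a further processing layer as events
instead of text, the StreamResult could be swapped for a
javax.xml.transform.sax.SAXResult wrapping your own ContentHandler.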


From:    Tobias Wahlström <tobias.wahlstrom@pricerunner.se>
Date:    06/30/2000 05:53 AM
To:      "'general@xml.apache.org'" <general@xml.apache.org>
cc:      (bcc: Scott Boag/CAM/Lotus)
Subject: Using own HTML parser with


I want to read an HTML file using my own HTML parser. The result of the
parser should be XSLT processed, then another layer of processing should
be applied, and the result written to a file. I would also like to use the
parser to produce a DOM tree at times.

The idea is that the HTML file -> XSLT -> own processing -> XML file should
be fast and not require a DOM tree to be built. The preferred way should
be to use some kind of event-driven mechanism. The intention is that I want
it to be rather fast and not so memory consuming.

Is this possible using Xalan (and Xerces)?
Is the SAX interface DocumentHandler usable to accomplish this?
Might the precompiled stylesheets (StylesheetRoot) be used?

If the questions above are completely stupid but you understood the
background I wrote above - please tell me what I should know instead of
answering my questions ;-)

Best regards,
Tobias Wahlström
Developer @ pricerunner.com
