xml-general mailing list archives

From Matt Sergeant <m...@sergeant.org>
Subject Re: HTML Parser Update Available
Date Sat, 13 Apr 2002 10:49:47 GMT
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wednesday 10 April 2002 5:14 pm, Harald Hett wrote:
> Recently I searched the web for an HTML parser that is capable of parsing
> dirty HTML of any kind. But nothing I found really convinced me.
> Only JTidy seemed to fit. But none of those solutions produces a
> DOM tree in the end that can be easily modified using the DOM API or
> an XSLT stylesheet. That is a nice and interesting feature of CyberNeko
> and makes it interesting to a lot of programmers.

Libxml2 parses dirty HTML and produces a DOM tree suitable for passing to 
XSLT, or for turning into XHTML, or re-rendering to HTML.
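
For illustration, here is a rough sketch of that workflow in Python. It
assumes the third-party lxml binding (built on libxml2 and libxslt) is
installed; the sample markup, the stylesheet, and the helper names
parse_dirty_html, to_xml and count_paragraphs are made up for this example
and are not part of any library.

```python
from lxml import etree  # assumption: lxml, a Python binding over libxml2/libxslt

# Hypothetical tag soup: unclosed <p>, <b> and <br>, no closing </html>.
DIRTY_HTML = "<html><body><p>First paragraph<p>Second <b>bold<br></body>"

# Hypothetical stylesheet that just counts the <p> elements in the tree.
STYLESHEET = etree.XML(b"""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:value-of select="count(//p)"/>
  </xsl:template>
</xsl:stylesheet>""")


def parse_dirty_html(text):
    """Parse tag soup with libxml2's HTML parser and return the root element."""
    parser = etree.HTMLParser(recover=True)
    return etree.fromstring(text, parser)


def to_xml(root):
    """Serialize the repaired tree as well-formed XML markup."""
    return etree.tostring(root, method="xml", encoding="unicode")


def to_html(root):
    """Re-render the repaired tree as HTML."""
    return etree.tostring(root, method="html", encoding="unicode")


def count_paragraphs(root):
    """Pass the parsed tree to XSLT and return the stylesheet's text output."""
    transform = etree.XSLT(STYLESHEET)
    return str(transform(root.getroottree()))


if __name__ == "__main__":
    root = parse_dirty_html(DIRTY_HTML)
    print(to_xml(root))
    print(to_html(root))
    print(count_paragraphs(root))
```

The same tree feeds all three outputs, so the HTML is only parsed (and
repaired) once. Note that to_xml only gives well-formed markup; real XHTML
would also want the XHTML namespace and a doctype.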

- -- 
<:->get a SMart net</:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iEYEARECAAYFAjy4DUsACgkQVBc71ct6Oyzv6gCgpeTpMLE3hbCVvBV858+7DMNZ
TQwAnRbpCByXZ3WcyOKO3tpEKvW5kxhl
=GLsL
-----END PGP SIGNATURE-----
