xerces-j-users mailing list archives

From Andy Clark <an...@apache.org>
Subject Re: Using loose.dtd and strict.dtd from xerces
Date Tue, 05 Nov 2002 22:41:46 GMT
Dariush Behboudi wrote:

> Hi everyone,
> I'm new to xerces and I'm trying to validate an Html file using w3c's
> dtds strict.dtd and loose.dtd.

The HTML DTDs are written in SGML DTD syntax, which
is a superset of what is allowed in an XML DTD, so
an XML parser such as Xerces cannot read them. If you
want to use HTML but also perform validation, then I
would suggest using XHTML, which is the XML version
of the HTML specification.
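
As a rough illustration of the XHTML route, here is a
minimal sketch of DTD validation through the standard
JAXP API, which Xerces implements. The class name
DtdValidationSketch and the tiny inline DTD are made up
for this example so that it runs without fetching
anything; a real XHTML document would instead carry an
XHTML DOCTYPE and be validated against the W3C's XHTML
DTDs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, not part of Xerces.
final class DtdValidationSketch {

    // Assumption: a toy XML DTD standing in for the real XHTML DTDs.
    static final String SAMPLE_DTD =
          "<!DOCTYPE html [\n"
        + "  <!ELEMENT html (head, body)>\n"
        + "  <!ELEMENT head (title)>\n"
        + "  <!ELEMENT title (#PCDATA)>\n"
        + "  <!ELEMENT body (p*)>\n"
        + "  <!ELEMENT p (#PCDATA)>\n"
        + "]>\n";

    // Parses the text with DTD validation enabled and returns the
    // messages of all validity errors the parser reported.
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // validate against the document's DTD

        DocumentBuilder builder = factory.newDocumentBuilder();
        final List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) {
                // warnings are ignored in this sketch
            }
            public void error(SAXParseException e) {
                errors.add(e.getMessage()); // validity errors arrive here
            }
            public void fatalError(SAXParseException e) throws SAXParseException {
                throw e; // not well-formed: give up
            }
        });

        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String valid = SAMPLE_DTD
            + "<html><head><title>t</title></head>"
            + "<body><p>hello</p></body></html>";
        // Invalid against the toy DTD: the required <head> is missing.
        String invalid = SAMPLE_DTD
            + "<html><body><p>hello</p></body></html>";

        System.out.println("valid document errors:   " + validate(valid).size());
        System.out.println("invalid document errors: " + validate(invalid).size());
    }
}
```

Note that without an error handler the parser only
prints validity errors and keeps going, so collecting
them yourself is the usual way to find out whether the
document passed.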

If validation is not important and you just want
to parse HTML documents in your application, check
out JTidy[1] and NekoHTML[2]. JTidy does a very
good job of cleaning up HTML files, but is best
used for automatic conversion and for accessing the
document through DOM. NekoHTML is a bit smaller and
lets you use the SAX API as well as DOM. If
appropriate to your needs, try both and see which
works best for you.

[1] http://lempinen.net/sami/jtidy/
[2] http://www.apache.org/~andyc/neko/doc/html/

-- 
Andy Clark * andyc@apache.org

