xml-general mailing list archives

From Andy Clark <an...@apache.org>
Subject [Announce] NekoHTML 0.4 Parser for Xerces2 Available
Date Sun, 14 Apr 2002 11:55:49 GMT
== About ==

NekoHTML is a simple HTML scanner and tag balancer that enables 
application programmers to parse HTML documents and access the 
information using standard XML interfaces. The parser can scan 
HTML files and "fix up" many common mistakes that human (and 
computer) authors make in writing HTML documents. NekoHTML adds 
missing parent elements; automatically closes elements with 
optional end tags; and can handle mismatched inline element tags. 
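To make the "fix up" idea concrete, here is a toy sketch of tag
balancing over a pre-split list of tokens. It is only an illustration
under stated assumptions, not NekoHTML's actual algorithm or API: the
class name TagBalancerSketch, the short list of optional-end-tag
elements, and the choice to drop stray end tags are all made up for
this example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this class is not part of NekoHTML.
final class TagBalancerSketch {
    // Assumed subset of elements whose end tag is optional: a new start
    // tag of the same name implicitly closes the one that is still open.
    private static final Set<String> OPTIONAL_END =
        new HashSet<>(Arrays.asList("p", "li", "td", "tr"));

    // Tokens are either "<name>", "</name>", or plain text.
    static String balance(List<String> tokens) {
        Deque<String> open = new ArrayDeque<>(); // stack of open elements
        StringBuilder out = new StringBuilder();
        for (String tok : tokens) {
            if (tok.startsWith("</")) {
                String name = tok.substring(2, tok.length() - 1);
                if (!open.contains(name)) {
                    continue; // stray end tag with no open element: drop it
                }
                // Close everything down to and including the matching element.
                while (!open.isEmpty()) {
                    String top = open.pop();
                    out.append("</").append(top).append('>');
                    if (top.equals(name)) {
                        break;
                    }
                }
            } else if (tok.startsWith("<")) {
                String name = tok.substring(1, tok.length() - 1);
                // Supply the omitted end tag before opening a sibling.
                if (OPTIONAL_END.contains(name) && name.equals(open.peek())) {
                    out.append("</").append(open.pop()).append('>');
                }
                open.push(name);
                out.append(tok);
            } else {
                out.append(tok); // character data passes through unchanged
            }
        }
        // Close anything still open at end of input.
        while (!open.isEmpty()) {
            out.append("</").append(open.pop()).append('>');
        }
        return out.toString();
    }
}
```

With this sketch, the tokens <ul>, <li>, one, <li>, two, </ul> come
out with both list items closed, and the mismatched pair in
<b><i>x</b></i> is closed in nesting order with the stray </i>
dropped. The real parser may well resolve such cases differently.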

NekoHTML is written using the Xerces Native Interface (XNI), which 
is the foundation of the Xerces2 implementation. This enables you 
to use the NekoHTML parser with existing XNI tools without 
modifying or rewriting code. 

The NekoHTML parser is available under an Apache-style licence.

== This Release ==

Changes from the last release include:

  * added properties to control case of element and attribute 
    names; 
  * changed behavior of parser so that only known HTML elements 
    have their names modified according to the properties -- all 
    unknown tags are left as-is; 
  * added property to set default encoding; 
  * added feature to augment infoset to report "synthesized" events; 
  * added feature to be able to report errors and localized the 
    error messages; 
  * implemented the locator so that location information can be 
    reported; and 
  * fixed element information so that more elements are properly 
    scanned as "special".

In addition, new documentation was written to demonstrate how
to take advantage of the new features and properties.
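
The name-case change in the list above can be pictured with a small
sketch: only names found in a table of known HTML elements are
re-cased, and everything else passes through untouched. This is a
hypothetical helper written for illustration, not NekoHTML code; the
class name NameCaseSketch, the tiny element table, and the "upper" and
"lower" mode strings are assumptions of this example rather than the
parser's documented property values.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; this class is not part of NekoHTML.
final class NameCaseSketch {
    // Deliberately tiny stand-in for the parser's table of known elements.
    private static final Set<String> KNOWN = new HashSet<>(Arrays.asList(
        "html", "head", "title", "body", "p", "b", "i", "table", "tr", "td"));

    // mode: "upper" or "lower" (assumed names); any other value leaves
    // the element name exactly as it was written.
    static String normalize(String name, String mode) {
        if (!KNOWN.contains(name.toLowerCase(Locale.ROOT))) {
            return name; // unknown tag: left as-is
        }
        switch (mode) {
            case "upper":
                return name.toUpperCase(Locale.ROOT);
            case "lower":
                return name.toLowerCase(Locale.ROOT);
            default:
                return name;
        }
    }
}
```

Under these assumptions, normalize("Body", "upper") yields "BODY",
while an unknown name such as "myTag" is returned unchanged whatever
the mode.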

== Other ==

I thought I was going to just put in a few new features people
were asking for and then release it. But then I wanted to add
another feature and another and... Well, you get the idea. :)
So this release has very little change in terms of behavior
but includes a bunch of useful new features.

Have fun!

Andy Clark * andyc@apache.org
