xml-commons-dev mailing list archives

From Henri Yandell <bay...@generationjava.com>
Subject [Util] XmlWriter
Date Sat, 17 Aug 2002 23:45:26 GMT

Here's a post I just sent to the Jakarta Commons list. It contains some
code that I find useful for Xml work, though I'm not sure whether Xmlers
would see it as 'good'.

I have three main classes. The first is an XmlUtils class that just
provides me with some simple String handling methods: escapeXml,
removeXml, that kind of thing. Some very simple parsing methods too.

"Find me the next tag with the name 'name' and get me its contents" etc.
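
To make the above concrete, here is a minimal sketch of what such helpers
could look like. It is not the committed XmlUtils code: the class name
XmlUtilsSketch, the getContent method and all signatures are assumptions
made for illustration. Only the names escapeXml and removeXml come from the
description above.

```java
public final class XmlUtilsSketch {

    private XmlUtilsSketch() {
    }

    /** Replaces the five characters XML reserves with their predefined entities. */
    public static String escapeXml(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Strips anything that looks like a tag and keeps the text around it. */
    public static String removeXml(String text) {
        return text.replaceAll("<[^>]*>", "");
    }

    /**
     * Hypothetical "next tag with this name" helper: returns the text between
     * the next <name ...> and </name> at or after 'from', or null if there is
     * no such pair. Plain string searching, no parser; self-closing tags and
     * nested tags of the same name are not handled.
     */
    public static String getContent(String xml, String name, int from) {
        String open = "<" + name;
        int idx = from;
        while ((idx = xml.indexOf(open, idx)) >= 0) {
            int after = idx + open.length();
            if (after < xml.length()) {
                char c = xml.charAt(after);
                // Accept <name> and <name attr="..."> but not <names>.
                if (c == '>' || Character.isWhitespace(c)) {
                    int start = xml.indexOf('>', after);
                    if (start < 0) {
                        return null;
                    }
                    int close = xml.indexOf("</" + name + ">", start);
                    return close < 0 ? null : xml.substring(start + 1, close);
                }
            }
            idx = after;
        }
        return null;
    }
}
```

The point of the sketch is the level it works at: Strings in, Strings out,
with no document model in between.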

Secondly I have an XmlWriter class. I discussed it first in an article
[mentioned in the forwarded email below] and the basic idea is that I
don't see great ways to output chunks of Xml. I've seen too many projects
where people use StringBuffer to do it, and I hate the concept of having
to build up a DOM and serialise it just to write some XML to a file. I'm
sure people don't build up a StringBuffer or StringWriter in memory first
before writing a text file. A lot of the time they simply write each
String at a time to the file.

XmlWriter has improved recently with some help from Peter Cassetta. We've
taken care to keep it simple and not take on too much.

My primary target for XmlWriter is a system at work where we send small
chunks of Xml from component to component.
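
As an illustration of the idea (writing Xml straight to a java.io.Writer as
you go, rather than building a DOM or a big StringBuffer first), here is a
rough sketch. It is not the committed XmlWriter: the class name
XmlWriterSketch and the methods startElement, attribute, text and endElement
are hypothetical names chosen for this example, and pretty-printing is left
out.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.ArrayDeque;
import java.util.Deque;

public class XmlWriterSketch {

    private final Writer out;
    // Names of the elements that have been started but not yet ended.
    private final Deque<String> open = new ArrayDeque<>();
    // True while "<name" has been written but its closing ">" has not.
    private boolean tagOpen = false;

    public XmlWriterSketch(Writer out) {
        this.out = out;
    }

    public XmlWriterSketch startElement(String name) throws IOException {
        closeStartTag();
        out.write('<');
        out.write(name);
        open.push(name);
        tagOpen = true;
        return this;
    }

    public XmlWriterSketch attribute(String name, String value) throws IOException {
        if (!tagOpen) {
            throw new IllegalStateException("attribute must directly follow startElement");
        }
        out.write(' ');
        out.write(name);
        out.write("=\"");
        out.write(escape(value));
        out.write('"');
        return this;
    }

    public XmlWriterSketch text(String text) throws IOException {
        closeStartTag();
        out.write(escape(text));
        return this;
    }

    public XmlWriterSketch endElement() throws IOException {
        if (open.isEmpty()) {
            throw new IllegalStateException("no open element to end");
        }
        String name = open.pop();
        if (tagOpen) {
            // Nothing was written inside the element, so emit <name/>.
            out.write("/>");
            tagOpen = false;
        } else {
            out.write("</");
            out.write(name);
            out.write('>');
        }
        return this;
    }

    private void closeStartTag() throws IOException {
        if (tagOpen) {
            out.write('>');
            tagOpen = false;
        }
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

Under these assumptions, a caller wraps a FileWriter or a socket's Writer
and chains startElement("person").attribute("id", "1").text("Bob")
.endElement(); each call writes to the underlying Writer immediately, and
the only state kept in memory is the stack of open element names.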

The third class, which is not yet submitted, is a HtmlScraper class.
Really it's an XmlScraper class that doesn't parse the Xml. Instead it
provides methods by which you can jump to tags, read in partial bits of
tags, but ignore the fact that other parts of the Xml might change. The
primary use for this is in scraping Html without having to parse the Html
or worry if earlier bits of Html change in structure. It's described in an
article at:

It knows nothing about Html, but since it is a parser for bad Xml in which
the bad Xml does not affect the parsing code, it's probably not useful for
much other than Html scraping.
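
A rough sketch of the scraping approach described above, for anyone who
wants something concrete to react to. The HtmlScraper class is not yet
submitted, so everything here is an assumption: the class name
ScraperSketch and the methods moveToTag, attribute and text are made up for
this example. It keeps a cursor into the page and uses plain string
searches, so markup elsewhere on the page can be broken or can change
without affecting the calling code.

```java
public class ScraperSketch {

    private final String page;
    // Cursor: index just past the "<name" of the tag we last jumped to.
    private int pos = 0;

    public ScraperSketch(String page) {
        this.page = page;
    }

    /**
     * Jumps to the next tag with the given name after the cursor.
     * Returns false, leaving the cursor alone, if there is none.
     * Matching is case-sensitive in this sketch.
     */
    public boolean moveToTag(String name) {
        String open = "<" + name;
        int idx = pos;
        while ((idx = page.indexOf(open, idx)) >= 0) {
            int after = idx + open.length();
            if (after < page.length()) {
                char c = page.charAt(after);
                // Accept <td>, <td ...> and <td/> but not <tdx>.
                if (c == '>' || c == '/' || Character.isWhitespace(c)) {
                    pos = after;
                    return true;
                }
            }
            idx = after;
        }
        return false;
    }

    /**
     * Reads one attribute of the current tag, or null if it is absent.
     * Assumes the form attr="value" preceded by a space, with no '>'
     * inside any attribute value.
     */
    public String attribute(String attr) {
        int end = page.indexOf('>', pos);
        if (end < 0) {
            return null;
        }
        String key = " " + attr + "=\"";
        int start = page.indexOf(key, pos);
        if (start < 0 || start > end) {
            return null;
        }
        start += key.length();
        int close = page.indexOf('"', start);
        return close < 0 ? null : page.substring(start, close);
    }

    /** Returns the text between the end of the current tag and the next '<'. */
    public String text() {
        int start = page.indexOf('>', pos);
        if (start < 0) {
            return null;
        }
        int end = page.indexOf('<', start);
        if (end < 0) {
            end = page.length();
        }
        return page.substring(start + 1, end).trim();
    }
}
```

Nothing in the sketch checks that tags are closed or nested properly, which
is the design choice being described: the caller only names the tags it
cares about.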

Anyways, opinions, disagreements, and comments on whether this is a good
thing, and something which fits with Xml-Commons, would be very welcome.
If there's anything else out there which handles Xml outputting [and not
Xml serialising], I'd also like to hear about it.



---------- Forwarded message ----------
Date: Sat, 17 Aug 2002 19:25:49 -0400 (EDT)
From: Henri Yandell <bayard@generationjava.com>
Reply-To: Jakarta Commons Developers List <commons-dev@jakarta.apache.org>
To: Jakarta Commons Developers List <commons-dev@jakarta.apache.org>
Subject: [Util] XmlWriter

I've finally gotten around to committing XmlWriter, a class which I think
is pretty basic and yet pretty useful. I described it a while back in an
article at:

I'm storing it in Jakarta Commons Util initially along with XmlUtils. I'm
starting to talk to the guys at Xml-Commons about both classes. They're
nothing fancy, but I think they hit a good level for people who will not
want to spend lots of time using SAX/DOM parsers and end up using String

Many thanks to Peter Cassetta for getting me moving on this again and
supplying benchmarks and pretty-printing.


