cocoon-users mailing list archives

From Bruno Dumon
Subject Re: Using htmlArea 'output' with SVG
Date Wed, 24 Nov 2004 08:44:28 GMT
On Wed, 2004-11-24 at 09:25 +0100, Ugo Cei wrote:
> Derek Hohls wrote:
> > Bruno - I thought Ugo was the one who came up 
> > with the code we were talking about - it's called "HTMLparser"
> > (was attached to a previous email) ... what is the difference
> > between these two?
> Without having seen Bruno's code, the difference is probably that mine 
> is a quick and dirty solution that got the job done for me when I needed 
> it, whereas Bruno's is a reusable, well-documented, efficient component :)

Nah, the HtmlCleaner serves a different purpose altogether. It starts
by parsing the input with NekoHTML, but then performs further
filtering, conversion and restructuring on it to produce clean output,
limited to a subset of the HTML DTD. At the end it serializes the result
prettily, i.e. with whitespace collapsing, line breaks at a certain width, etc.
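
To make the last step concrete, here is a minimal sketch of whitespace
collapsing followed by line breaking at a fixed width. It is not taken from
the HtmlCleaner source: the class PrettyText and its methods are hypothetical
names, and it works on plain text only, whereas a real serializer also has to
deal with markup.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HtmlCleaner: illustrates whitespace
// collapsing plus greedy line breaking at a fixed width.
class PrettyText {

    // Collapse every run of whitespace (spaces, tabs, newlines) to one space
    // and drop leading/trailing whitespace.
    static String collapse(String text) {
        return text.trim().replaceAll("\\s+", " ");
    }

    // Greedy wrap: start a new line when adding the next word (plus one
    // separating space) would make the current line longer than width.
    // A single word longer than width is kept whole on its own line.
    static List<String> wrap(String text, int width) {
        List<String> lines = new ArrayList<>();
        String collapsed = collapse(text);
        if (collapsed.isEmpty()) {
            return lines;
        }
        StringBuilder line = new StringBuilder();
        for (String word : collapsed.split(" ")) {
            if (line.length() > 0 && line.length() + 1 + word.length() > width) {
                lines.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        lines.add(line.toString());
        return lines;
    }
}
```

Because the result depends only on the words and the width, two inputs that
differ only in spacing or line breaks end up as identical text.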

One consequence is that if you enter the same text in Mozilla or IE,
you'll get the same textual output (there are still some small things left
that need to be fixed), which allows doing source diffs on the edited

Thus the end result is a string (or byte array) which will need to be
parsed again.
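
As a rough sketch of that second parse, the snippet below turns such a string
back into a DOM using only the standard JAXP API. It assumes the cleaned
output is well-formed XML; the class name Reparse and the sample markup are
made up for illustration, and inside a Cocoon pipeline you would more likely
want SAX events than a DOM.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: re-parses an already cleaned, well-formed
// markup string into a DOM tree.
class Reparse {

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Wrap the string in a reader so the parser can consume it.
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Malformed input or parser configuration problems end up here.
            throw new IllegalArgumentException("Could not parse cleaned markup", e);
        }
    }
}
```

For example, Reparse.parse("<html><body><p>Hello <em>world</em></p></body></html>")
returns a document whose root element is html.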

If you want to allow any HTML, not limited to a certain (configurable)
subset of the HTML DTD, simply use plain NekoHTML.

Bruno Dumon                   
Outerthought - Open Source, Java & XML Competence Support Center                
