cocoon-users mailing list archives

From Nicola Ken Barozzi <nicola...@apache.org>
Subject Re: Handling lousy HTML
Date Fri, 06 Sep 2002 13:05:54 GMT

Ola Berg wrote:
> From: "John Moylan" <john@rte.ie>
> 
>>You probably need to preprocess your HTML with tidy before you introduce 
>>it to Cocoon.
> 
> 
> Well, according to the sitemap in the cocoon dist (2.0.2), jtidy is
> involved in the HTML generator.

Yes, correct.

> Yes, preprocessing is a necessity. But I need it preprocessed live and
> direct in the pipeline by Cocoon, as the bad HTML is generated by legacy
> scripts that no one dares to touch, just wrap using Cocoon.
> 
> Either way: a HeavyDutyMrProperHtmlGenerator that fixes this using some
> heavy tidy-stuff should be useful. I understand if the normal
> HTMLGenerator doesn't want to waste cycles on handling "HTML" that never
> should have been written anyway, but if you _know_ you have to deal with
> pages generated by FrontPage0.6 or perl scripts done by interns in the
> summer of '96, I think the option should be available.
> 
> Does such a beast exist somewhere?

HTMLGenerator uses JTidy directly, without making assumptions itself.
If you can get JTidy to work for you, it should work - or can be easily
made to work - with HTMLGenerator too.

> If not, I intend to write one, as the problem at our company needs to
> be solved about this yesterday :-)

Look here; maybe it's the right time to ditch tidy entirely:

http://www.apache.org/~andyc/neko/doc/html/index.html

> BTW: the example I provided is actually cleaner than much of the code
> I need Cocoon to deal with.

:-O

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)