httpd-users mailing list archives

From Nick Kew <n...@webthing.com>
Subject Re: [users@httpd] Stripping white space from HTML
Date Sat, 10 Mar 2007 09:54:08 GMT
On Fri, 9 Mar 2007 17:42:52 -0800
"Mark Lavi" <mlavi@sgi.com> wrote:

> The best answer is to correct things at the source in your shopping
> cart, file a bug there!
> 
> But in Apache2 you have other potential answers:
> 
> Try http://mod-tidy.sourceforge.net/ and learn about its parent
> project: Tidy at: http://tidy.sourceforge.net/

There are serious performance issues with tidy.  It has no
streaming mode, and parses everything to an in-memory tree, so
it's inherently not scalable and breaks Apache's pipelining.

> ...and you'll get XHTML compliance as well

Tidy itself makes no such claim.  By contrast, mod_tidy did make 
some such bogus claims last time I looked.  The bottom line was
that the developers of the latter appear ill-informed on the
meaning of (X)HTML compliance.  Their code was also rather alarming.

> However, you'll also incur a performance hit on delivering pages,

Yep.

You'd incur a far lower penalty using a SAX-based parser such as
mod_proxy_html or mod_publisher.
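
To illustrate the streaming idea only (this is not how mod_proxy_html
or mod_publisher are implemented), here is a minimal sketch using
Python's standard html.parser, which is event-driven in the SAX sense.
The names WhitespaceCollapser and collapse_whitespace are hypothetical.
Each chunk is processed and written out as it arrives, so no document
tree is ever built.  It assumes that whitespace inside pre, textarea,
script and style must be left alone.

```python
import re
from html.parser import HTMLParser


class WhitespaceCollapser(HTMLParser):
    """Event-driven filter: output is written as events arrive, no tree is built."""

    # Elements whose whitespace is significant (an assumption of this sketch).
    PRESERVE = {"pre", "textarea", "script", "style"}

    def __init__(self, write):
        # convert_charrefs=False keeps entities as separate events so they
        # can be passed through unchanged.
        super().__init__(convert_charrefs=False)
        self.write = write
        self.preserve_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.PRESERVE:
            self.preserve_depth += 1
        self.write(self.get_starttag_text())  # original tag text, verbatim

    def handle_startendtag(self, tag, attrs):
        self.write(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.PRESERVE and self.preserve_depth:
            self.preserve_depth -= 1
        self.write(f"</{tag}>")

    def handle_data(self, data):
        if self.preserve_depth == 0:
            # Collapse each run of whitespace to a single space.
            data = re.sub(r"\s+", " ", data)
        self.write(data)

    def handle_entityref(self, name):
        self.write(f"&{name};")

    def handle_charref(self, name):
        self.write(f"&#{name};")

    def handle_comment(self, data):
        self.write(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.write(f"<!{decl}>")


def collapse_whitespace(chunks):
    """Feed an iterable of text chunks through the filter and join the output."""
    out = []
    parser = WhitespaceCollapser(out.append)
    for chunk in chunks:
        parser.feed(chunk)
    parser.close()
    return "".join(out)
```

Because text is flushed per chunk, a whitespace run that straddles two
chunks can leave two spaces rather than one.  A real filter would carry
that state across calls.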

> may not work with your setup easily, and it may also break the way
> some HTML renders in browsers.

I expect that's the same issue as described in Question 3 of the
mod_proxy_html FAQ (namely, severely broken HTML).  Lots of
whitespace doesn't mean it's broken, FWIW!

If the issue is just one of transmitting far too many bytes, then
standard compression with mod_deflate will fix that.  That's also
a performance hit, so you might want to use mod_cache.
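
As a rough illustration of why compression deals with excess
whitespace, the sketch below uses Python's zlib module, which
implements the same deflate algorithm that mod_deflate applies.  The
sample markup and the name deflated_size are made up for the example,
and the input is far more repetitive than a real page, so treat the
result as an indication of the effect rather than a measurement.

```python
import zlib


def deflated_size(body: bytes, level: int = 6) -> int:
    """Return the size in bytes of body after zlib (deflate) compression."""
    return len(zlib.compress(body, level))


# Hypothetical sample markup: the same table rows, with and without padding.
row = b"<tr><td>item</td><td>9.99</td></tr>"
compact = b"<table>" + row * 200 + b"</table>"
padded = b"<table>" + (b"\n" + b" " * 40 + row + b"\n" * 3) * 200 + b"</table>"

print("compact, raw:     ", len(compact))
print("padded, raw:      ", len(padded))
print("compact, deflated:", deflated_size(compact))
print("padded, deflated: ", deflated_size(padded))
```

Long runs of spaces and newlines are exactly the kind of redundancy
deflate removes, so the padding adds little to the compressed size.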

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org

