On Sun, 7 Nov 2004, Cliff Woolley wrote:

> > > Why not introduce and option to remove the
> > > comments from the served file?
> >
> > You have that option from mod_xmlns, mod_proxy_html or mod_publisher,
> > to name but three. Along with caveats about why it's not necessarily
> > a good idea.
>
> Ultimately I suspect that the biggest problem (assuming that the "what's a
> comment and what isn't" problem is solvable) is that this kind of parsing
> takes a disproportionately large amount of CPU time with respect to the
> amount of network bandwidth it saves.

Agreed. The reason those modules offer it is that they're already
parsing the markup, so there's no additional overhead. mod_include
could offer it for the same reason.

OTOH, the fact that we have - and people use - mod_deflate demonstrates
that there is a demand for byte count reduction. mod_deflate uses a
great deal more CPU, and achieves a great deal more savings, than any
of the above.

> I mean really, how much bandwidth
> from *html* are we talking about? Compared to images? If you start
> parsing the html, you lose any ability to do zero-copy and the like, not
> to mention the fact that the CPU has to examine every single byte of the
> input and process and/or copy it. Yuck.

I implemented that for a Client who is serving slow devices (mobile
phones at 9600 baud and with a lot of latency per connection). That
involved compressing both HTML and a lot of other contents, including
images (the details are content-negotiated). My opinion was that the
gains from manipulating HTML were not worth the extra hassle over
mod_deflate, but the Client - one of the best-known names in the
business - took the view that it was worthwhile.

BTW, the "what is a comment" problem is easier than it looks, as both