httpd-docs mailing list archives

From Michael.Schro...@telekurs.de
Subject Antwort: Re: [Review] mod_deflate.xml (Revision) (was: [Review] mod_deflate.xml)
Date Tue, 19 Nov 2002 02:24:28 GMT

Hi Andy,


>  put the whole revision online again at
> <http://cvs.apache.org/~nd/manual/mod/mod_deflate.html>

> Some popular browsers cannot handle compression of all content

You might want to declare some browsers safe here, similar to the
browser lists in the mod_auth_digest documentation:
- Mozilla from high 0.9.x numbers (definitely all 1.x versions),
- Netscape 6.2 and up,
- Opera 5 and up,
- Internet Explorer 5 and up (maybe even 4, I don't have such a
  box here, only in "compatibility mode" ... there it works).
Or you could explicitly blame Netscape 4 for doing what it does,
and claim everything that came later "most likely safe".
This should be a good approximation for the reader about what to
expect: After 5 years the browsers finally got it right.

> If you want to restrict the compression to particular MIME types in
> general, you may use the AddOutputFilterByType directive.

You might add a note about this being a valid alternative to
excluding Netscape 4.
Actually, I used this in a production environment with a
lot of Netscape 4 customers (before I decided to have all those
CSS and JS files included server-side instead, as my pages
are mostly dynamic anyway and this even saved lots of HTTP
headers, which aren't compressed after all).
And did you ever try to send a compressed PDF to an MSIE 5 Acrobat
plugin? Now this depends ... ;-(
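
As a rough illustration of the AddOutputFilterByType alternative
discussed above, the following sketch compresses only a few text
types, so that a Netscape 4 client never receives compressed CSS or
JavaScript. The chosen MIME types are only an assumption about what a
site considers safe; adjust them to your own content.

```
# Compress only selected MIME types instead of excluding browsers.
# The type list is illustrative, not a recommendation.
AddOutputFilterByType DEFLATE text/html text/plain text/xml
```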

> Note, that the Microsoft Internet Explorer

I think this comma had better be removed in an English version.

> Now if a request contains a Content-Encoding: gzip  header

I am rather undecided about the appropriate location, but I
am thinking of some place that tells the reader how and why this
actually happens, especially for IE.
Surely this is none of Apache's core business, but one might
be disappointed not to serve compressed content to a lot of
browsers just because these Internet Explorers ship with a
default setting that keeps them from sending the proper HTTP
header when using a proxy server, which can be assumed
to be the case for most installations on office systems.

Or maybe just a note how to check whether this header actually
arrived from the browser ... I remember the log definitions
provide for HTTP request headers being accessible like
"%{Accept-Encoding}i", at least this worked for Apache 1.3. ;-)
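
To check this in practice, a log format along the following lines
could be used. This is a minimal sketch; the format nickname
`encoding_check` and the log file name are made up for illustration.

```
# Log the Accept-Encoding request header next to the request line
LogFormat "%h %t \"%r\" %>s %b \"%{Accept-Encoding}i\" \"%{User-Agent}i\"" encoding_check
CustomLog logs/encoding_check_log encoding_check
```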

> +    <p>A HTTP compiliant proxy now delivers the cached data

compiliant -> compliant.
    ^^
(There is a second occurrence of this, somewhere later in the file.)

> to any client, which sends the

Rather get rid of both commas in this sentence again, and make
the second "which" a "that", maybe?

> +    the client, which did the initially request that was cached.</p>

initially -> initial. (You aren't referring to a verb but to a
       ^^^^             noun, which makes the difference,
                      at least I believe ... ;-)

> thus you have to tell him, what you're doing.

Looks like another comma to be removed.

> +    <p>Fine. But what happens, if you use some special exclusions
> dependant

dependant -> depending (not quite sure about this one)
      ^^^^^^

> You have to use the mod_headers  module to add appropriate values to the
> Vary header, for example:

Actually, the real problem starts when you set the mod_deflate
environment variables conditionally, based on
the content of the "User-Agent" field, because you will now
have (or at least want) to set the "Vary:" header conditionally
as well (sending more "Vary:" than appropriate may cause a
severe performance penalty for the proxies, see below).
How about combining these two things into one larger, "real-life"
example? (Only if there is an easy way to handle this at all,
of course.)

Anyway, things can easily become messy if the conditions
become a little more complicated.
You may warn the user about that, as it may require using
"Header append" to dynamically compose the "Vary:" header's
content.
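
A combined example of this kind might look like the sketch below. It
assumes mod_setenvif and mod_headers are loaded; `dont-vary` is just
an environment variable name chosen for this sketch, while `no-gzip`
and `gzip-only-text/html` are the variables mod_deflate evaluates.

```
# Compress everything by default
SetOutputFilter DEFLATE

# Netscape 4.x: only text/html may be compressed
BrowserMatch ^Mozilla/4 gzip-only-text/html
# Netscape 4.06-4.08: no compression at all
BrowserMatch ^Mozilla/4\.0[678] no-gzip
# MSIE also identifies itself as Mozilla/4; lift the restrictions again
BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

# Images are never compressed, so they need no User-Agent variation
SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary

# Append User-Agent to Vary only where the decision depended on it
Header append Vary User-Agent env=!dont-vary
```

With more conditions, each one needs its own marker variable and its
own "Header append" line, which is where things get messy.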

> +    <p>That would result in the following response header:</p>
> +    <example>
> +      Vary: Accept-Encoding,User-Agent
> +    </example>

You might add a note here that in this special case this would
effectively disable caching for any proxy server, because proxies
would have to match the User-Agent strings exactly and thus
keep thousands of versions of the content.
Actually, using the User-Agent as a negotiation parameter is
something I discourage (in the mod_gzip docs) because of
this effect. You are faced with a trade-off about whom to let suffer:
either serve the Netscape 4 users broken content or serve everyone
correct but slow content. One should be aware of this situation
when using "User-Agent" as a negotiation dimension - there
will be a day when Netscape 4 users are only a marginal
minority (now that Netscape 7 is finally available even for office
installations that won't dare to install Mozilla). So plan ahead
for the future.

By the way, Squid 2.5 is the first proxy I have heard of (and
evaluated successfully myself) that is able to understand all
that and actually treat this negotiation stuff correctly
while still caching the content. (It will even handle the
"User-Agent" problem, to a certain degree at least.)
Squid 2.0 to 2.4 will take _any_ "Vary:" header as a reason to simply turn
off caching for this response - this is correct but quite a pity
in terms of performance. As long as there are not many Squid 2.5
installations around, the main requirement is sending any "Vary:" at all; the
more widely Squid 2.5 is used, the more important the
"Vary:" content will become.

But as there are other, broken proxies out there that cause a lot
of trouble when dealing with gzipped content, you might mention
the "Via:" header as another potential negotiation parameter
that is worth taking into consideration once you have identified
the "broken guy". (Does Apache provide an easy means for that?)
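
On the question of an easy means: SetEnvIf can match arbitrary
request headers, so something like the following sketch might do.
The proxy name `BrokenProxy` is purely hypothetical and stands for
whatever the offending proxy writes into its "Via:" header.

```
# Disable compression for requests relayed by a (hypothetical) broken proxy
SetEnvIf Via "BrokenProxy" no-gzip
# The decision now also depends on Via, so announce that to caches
Header append Vary Via
```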

> +    <p>If your decision about compression depends on other information
> +    than request headers (<em>e.g.</em> HTTP version), you have to set the
> +    <code>Vary</code> header to the value <code>*</code>. This prevents
> +    documents from caching by HTTP compiliant proxies at all.</p>

> The DeflateBufferSize directive specifies the size in bytes of the
> fragments that zlib should compress at one time.

This seems to be a performance tuning issue that does not change the
effective content output, right?  You might state this explicitly.
(Is there any reasonable range of values? How often will a buffer of
this size reside in memory, given some average Apache configuration?
Once per child process? Does it depend on the MPM?)

> DeflateFilterNote Directive

<feature_request>
I never liked the idea of serving some derived value here instead
of the actual sizes before and after compression. mod_gzip got that
wrong as well, and in fact even rounded incorrectly (94.01% was
shown as "95%", as this might sell better, ouch :-\).
I would prefer having notes on the file sizes as well. And then,
what will be the value of the "ratio" note in case mod_deflate
chose not to compress this request? Can I rely on finding "0%"
there, so that I can write some compression effect evaluation tool based
on this assumption? (This note will provide me with the information
I need, unless some insufficiently precise "ratio" value
prevents me from inverting its calculation by multiplying it with
the %b content of the log format ... there may well be some mgzta
clone for mod_deflate one day as well ...)
</feature_request>
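
For reference, a ratio note is typically logged as sketched below;
the note name `ratio` and the format nickname `deflate` are
arbitrary. Later mod_deflate releases also accept a type argument
(Input, Output, Ratio) for separate byte counts; whether a given
server supports it is an assumption to verify against its own
documentation.

```
# Store the compression ratio in a note and log it with the response size
DeflateFilterNote ratio
LogFormat '"%r" %b (%{ratio}n) "%{User-agent}i"' deflate
CustomLog logs/deflate_log deflate

# On releases that support a type argument (assumption: check your version):
# DeflateFilterNote Input instream
# DeflateFilterNote Output outstream
# DeflateFilterNote Ratio ratio
```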

> DeflateMemLevel Directive
> Description: How much memory should be used by zlib for compression

Is this the command-line parameter that is available for the UNIX
'gzip' command ("gzip -9 filename")?
If so, the gains of using more than 6 seem to be _very_ small
(below 1% in many cases), so you might suggest using
something in the range from 3 (reasonable minimum) to 6 for installations
that would like to save CPU power - they won't lose much
of the effect by doing so.
If you had the ability to cache the compressed content (like
gzip_cnc does, or when combining Apache with a front-end
Squid 2.5 proxy), _then_ 9 would be the perfect choice, but not
if each and every response has to be compressed again. Run benchmarks
on this; it may turn out to be quite expensive for high-traffic
servers.
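
A configuration sketch for the two tuning directives might look as
follows. One caveat, stated as an assumption to check against the
zlib documentation: in zlib itself the memory level (1 to 9) is a
separate parameter from the compression level that "gzip -1" to
"gzip -9" select, so the values below illustrate the directives'
syntax rather than the 3-to-6 suggestion.

```
# Memory zlib may use for its internal compression state (1-9)
DeflateMemLevel 9
# Size in bytes of the fragments handed to zlib at one time
DeflateBufferSize 8096
```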


Best regards

      Michael


