httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ian Holsman <i...@apache.org>
Subject Re: Content Codings (was: Re: cvs commit: httpd-2.0/modules/filters mod_deflate.c)
Date Sun, 26 May 2002 04:22:32 GMT
This is a patch to implement what you mentioned Greg.
I haven't got the time to fully test so hence it's in a email
instead of just being committed.
--Ian

Index: mod_deflate.c
===================================================================
RCS file: /home/cvs/httpd-2.0/modules/filters/mod_deflate.c,v
retrieving revision 1.6
diff -u -r1.6 mod_deflate.c
--- mod_deflate.c	20 May 2002 00:07:33 -0000	1.6
+++ mod_deflate.c	26 May 2002 04:21:00 -0000
@@ -232,6 +232,7 @@
  {
      apr_bucket *e;
      const char *accepts;
+    const char *contentencoding;
      request_rec *r = f->r;
      deflate_ctx *ctx = f->ctx;
      char *token = NULL;
@@ -273,8 +274,9 @@
              return ap_pass_brigade(f->next, bb);
          }

-        /* content is already encoded, so don't encode it again */
-        if (apr_table_get(r->headers_in, "Content-Encoding")) {
+        /* content is already gzip encoded, so don't encode it again */
+        contentencoding = apr_table_get(r->headers_in, "Content-Encoding");
+        if (contentencoding && ap_strstr(contentencoding,"gzip")) {
              ap_remove_output_filter(f);
              return ap_pass_brigade(f->next, bb);			
          }
@@ -303,12 +305,7 @@
          ctx = f->ctx = apr_pcalloc(r->pool, sizeof(*ctx));
          ctx->bb = apr_brigade_create(r->pool, f->c->bucket_alloc);
          ctx->buffer = apr_palloc(r->pool, c->bufferSize);
-/*
-        ctx->stream.zalloc = (alloc_func) 0;
-        ctx->stream.zfree = (free_func) 0;
-        ctx->stream.opaque = (voidpf) 0;
-        ctx->crc = 0L;
-*/
+
          zRC = deflateInit2(&ctx->stream, Z_BEST_SPEED, Z_DEFLATED,
                             c->windowSize, c->memlevel,
                             Z_DEFAULT_STRATEGY);
@@ -328,7 +325,16 @@
          e = apr_bucket_pool_create(buf, 10, r->pool, f->c->bucket_alloc);
          APR_BRIGADE_INSERT_TAIL(ctx->bb, e);

-        apr_table_setn(r->headers_out, "Content-Encoding", "gzip");
+        /*
+         * if it's not there, or is set to 'identity' just hard code it 
to gzip
+         * otherwise append 'gzip' to the other ones already there
+         */
+        if (contentencoding && strcmp(contentencoding,"identity")) {
+            apr_table_setn(r->headers_out, "Content-Encoding", 
apr_pstrcat(r->pool,contentencoding,", gzip", NULL));
+        }
+        else {
+            apr_table_setn(r->headers_out, "Content-Encoding", "gzip");
+        }
          apr_table_setn(r->headers_out, "Vary", "Accept-Encoding");
          apr_table_unset(r->headers_out, "Content-Length");
      }


Greg Stein wrote:
> On Mon, May 20, 2002 at 08:40:03AM -0700, Ian Holsman wrote:
> 
>>Cliff Woolley wrote:
>>
>>>On 20 May 2002 ianh@apache.org wrote:
>>>
>>>
>>>
>>>>ianh        02/05/19 17:07:33
>>>>
>>>> Modified:    modules/filters mod_deflate.c
>>>> Log:
>>>> content with "Content-Encoding" header, content is encoded.
>>>> But mod_deflate does not check it. It cause to encode content twice.
>>>>
>>>> This problem is reproducable by getting encoded content via mod_proxy.
>>>>
>>>> Patch Contributed by kaz@asada.sytes.net (ASADA Kazuhisa)
>>>> Bug #9222
>>>
> 
> This is the wrong fix. Take a look at RFC 2616. MULTIPLE encodings are
> allowed. If the Content-Encoding header said "identity", then we'll skip the
> compression. Bad Bad Bad.
> 
> The following is entirely legal:
> 
> Content-Encoding: gzip, compress, deflate
> 
> (dumb, but legal)
> 
> [ see sections 14.11 and 3.5 of RFC 2616 for more info ]
> 
> 
>>>Whoa, wait a minute.  That doesn't strike me as the right solution.  The
>>>encoding should be one-hop only.  If it's encoded and we want to maintain
>>>that encoding, chances are we'll have to decode it and re-encode it later.
>>
> 
> Depends on what we want to do with proxied requests. Do we run them through
> our own filter stack? (uncompressed) But also, it is entirely reasonable
> that we could build an inflator on the client-piece of our proxy to handle
> the case where Apache can speak compressed to the origin server, but we have
> to inflate it for our client.
> 
> (right now, the proxy is just copying over what the client allows; it would
>  be entirely reasonable to tell the origin server we can always accept
>  deflated content; of course, that also means we need to write the inflate
>  logic :-)
> 
> 
>>deflate runs at the end (I think it's a header/network type output 
>>filter)  and the output of this is compressed data.
>>
>>we do check if the user can accept the gzip encoding,
>>i assumed that if this has been content-encoded already that module has 
>>done all the checks necessary to make sure the client supports it.
> 
> 
> It happens to work because the proxy client stands in for the true client.
> But that isn't always a good assumption.
> 
> 
>>This should also fix the case where you have multiple content-encoders 
>>running on the same file.. the one with the highest priority (first in 
>>the chain) will do the encoding and the others will leave it alone.
> 
> 
> Absolutely wrong. Each of them can and should be able to apply themselves if
> they so choose.
> 
> The correct test is something like this:
> 
>   if Content-Encoding is present:
>       if Content-Encoding contains "deflate" then
>           skip-self
> 
> Essentially, we will apply a "deflate" encoding if it isn't already present.
> After doing the deflate, we then need to set the Content-Encoding using:
> 
>   if Content-Encoding is not present, or it equals "identity":
>       set Content-Encoding to "deflate"
>   else
>       append ", deflate" to Content-Encoding
> 
> 
>>...
>>
>>>Why you ask?  Because leaving it encoded makes it impossible to apply
>>>another filter on the proxy server (eg mod_include).  Now perhaps if we
>>>can guarantee that there are no other filters in the chain that will want
>>>to modify the content *and* that the client can actually accept the
>>>encoding, then as an optimization we can pass the data through the filter
>>>chain still encoded.  But that would only be an optimization.  And it
>>
> 
> Tough optimization. Right now, we don't have these kinds of "whole-chain"
> introspection concepts. Really, what we'd want is some kind of flag for the
> filters to say "I want uncompressed content". When the proxy wants to
> deliver compressed content, but it sees that filter in the stack, then it
> will decompress it.
> 
> (or the filter code itself would examine the stack and insert the right
>  translators between filters to ensure their inputs and ouputs are matched
>  up with the right encodings, character sets, etc)
> 
> 
>>>seems like it could be tricky to get it to always work doing it that way,
>>>perhaps.  (Is there ever a case where the client does not accept an
>>>encoding but the proxy does?)
>>
> 
> We could certainly implement the proxy that way.
> 
> Cheers,
> -g
> 




Mime
View raw message