httpd-modules-dev mailing list archives

From Joshua Marantz <jmara...@google.com>
Subject Help trying to figure out why an output_filter is not called.
Date Wed, 05 Jan 2011 13:45:57 GMT
One of the improvements mod_pagespeed is supposed to make to sites is to
extend the cache lifetime of their resources indefinitely by including a
content hash in the URL.  This works well for a large number of sites, but I
encountered one today where it does not.

To accomplish the cache extension, overriding any wildcarded or
directory-based expire settings a site admin has set for their resources,
mod_pagespeed inserts two output filters.

The first one does the HTML rewriting:

  ap_register_output_filter(
      kModPagespeedFilterName, instaweb_out_filter, NULL, AP_FTYPE_RESOURCE);

When instaweb_out_filter runs, it makes this transformation:

  before:  <script src="foo.js"></script>
  after:   <script src="foo.js.pagespeed.ce.HASH.js"></script>
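
As a rough illustration of that URL transformation (this is a sketch, not
mod_pagespeed's actual code; the function name CacheExtendedUrl is made up,
and it assumes the original extension is simply repeated after the hash):

```cpp
#include <string>

// Hypothetical helper: builds NAME.pagespeed.ce.HASH.EXT from NAME.
// The hash is passed in; how it is computed is out of scope here.
std::string CacheExtendedUrl(const std::string& url,
                             const std::string& hash) {
  const std::string::size_type npos = std::string::npos;
  std::string::size_type slash = url.rfind('/');
  std::string::size_type dot = url.rfind('.');
  // Only treat the dot as an extension if it is in the last path segment.
  bool has_ext = dot != npos && (slash == npos || dot > slash);
  std::string ext = has_ext ? url.substr(dot) : std::string();
  return url + ".pagespeed.ce." + hash + ext;
}
```

With the placeholder hash from the example above,
CacheExtendedUrl("foo.js", "HASH") gives "foo.js.pagespeed.ce.HASH.js".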

The rewritten resource, foo.js.pagespeed.ce.HASH.js, is served by a hook:

  ap_hook_handler(instaweb_handler, NULL, NULL, APR_HOOK_FIRST - 1);

Knowing that mod_headers will later override Cache-Control, which we don't
want, our hook serves the .js file with our own header:

   X-Mod-Pagespeed-Repair: max-age=31536000

We register a second output filter to repair the Cache-Control header:

  // We need our repair headers filter to run after mod_headers. The
  // mod_headers filter that adds the cache settings is
  // AP_FTYPE_CONTENT_SET, so we use (AP_FTYPE_CONTENT_SET + 2) to make
  // sure that we run after it.
  ap_register_output_filter(
      InstawebContext::kRepairHeadersFilterName, repair_caching_header,
      NULL, static_cast<ap_filter_type>(AP_FTYPE_CONTENT_SET + 2));
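
The ordering we rely on can be pictured with a toy model. This is only a
sketch of the idea that output filters with a lower ap_filter_type value run
earlier; it does not use httpd at all, the filter names are placeholders, and
it ignores how httpd orders filters that share the same type:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Numeric values of three ap_filter_type constants from util_filter.h.
enum FilterType {
  kResource = 10,    // AP_FTYPE_RESOURCE
  kContentSet = 20,  // AP_FTYPE_CONTENT_SET
  kProtocol = 30     // AP_FTYPE_PROTOCOL
};

struct Filter {
  std::string name;  // placeholder name, not a real registered filter name
  int ftype;         // numeric filter type; lower runs earlier
};

// Toy model: returns filter names in the order they would run, assuming
// the chain is kept sorted by ascending ftype.
std::vector<std::string> RunOrder(std::vector<Filter> filters) {
  std::stable_sort(filters.begin(), filters.end(),
                   [](const Filter& a, const Filter& b) {
                     return a.ftype < b.ftype;
                   });
  std::vector<std::string> names;
  for (const Filter& f : filters) {
    names.push_back(f.name);
  }
  return names;
}
```

In this model a filter at kContentSet + 2 (22) sorts after one at
kContentSet (20) and before anything at kProtocol (30), which is the
placement the comment above is aiming for.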

This is added into the filter chain whenever we want to extend cache:

  apr_table_add(request->headers_out, "X-Mod-Pagespeed-Repair",
                cache_control);
  ap_add_output_filter("X-Mod-Pagespeed-Repair",
                       NULL, request, request->connection);

When working properly, this header is removed from request->headers_out by
repair_caching_header():

  const char* cache_control = apr_table_get(request->headers_out,
                                            "X-Mod-Pagespeed-Repair");
  if (cache_control != NULL) {
    SetCacheControl(cache_control, request);
    apr_table_unset(request->headers_out, kRepairCachingHeader);
  }

Where SetCacheControl also makes the Expires header consistent, etc.
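
To show what "consistent" means here, the following is a minimal sketch of
deriving an Expires value from a max-age. It is not the real SetCacheControl;
the name ExpiresFromMaxAge is hypothetical, and code inside a module would
more likely use APR's apr_rfc822_date() than the C library:

```cpp
#include <string>
#include <time.h>

// Hypothetical helper: formats (now + max_age_seconds) as an HTTP date
// suitable for an Expires header. Assumes the default "C" locale so that
// day and month names come out in English.
std::string ExpiresFromMaxAge(time_t now, long max_age_seconds) {
  time_t expires = now + max_age_seconds;
  struct tm tm_utc;
  gmtime_r(&expires, &tm_utc);  // POSIX, thread-safe variant of gmtime
  char buf[64];
  strftime(buf, sizeof(buf), "%a, %d %b %Y %H:%M:%S GMT", &tm_utc);
  return std::string(buf);
}
```

For the max-age=31536000 value used above, the Expires header would then be
set to the request time plus 365 days.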

While this approach is complex, I've never seen it fail until today, on the
site http://law.thu.edu.tw/main.php . On that site, the
"X-Mod-Pagespeed-Repair" header is visible (it should have been removed)
and the Cache-Control header has the value set from the conf files
(public, max-age=600).  So on this server, the repair_caching_header filter
is not being run, despite having been programmatically inserted by our code
in the same place where we add the "X-Mod-Pagespeed-Repair" header.

What might be going wrong on this server to cause this to fail?  Could some
other filter be somehow finding our filter and removing it?  Or sending the
bytes directly to the network before our filter has a chance to run?

Thanks!
-Josh
