Return-Path: Delivered-To: apmail-httpd-dev-archive@www.apache.org Received: (qmail 61644 invoked from network); 5 Mar 2005 17:37:48 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (209.237.227.199) by minotaur-2.apache.org with SMTP; 5 Mar 2005 17:37:48 -0000 Received: (qmail 36387 invoked by uid 500); 5 Mar 2005 17:37:36 -0000 Delivered-To: apmail-httpd-dev-archive@httpd.apache.org Received: (qmail 36311 invoked by uid 500); 5 Mar 2005 17:37:36 -0000 Mailing-List: contact dev-help@httpd.apache.org; run by ezmlm Precedence: bulk Reply-To: dev@httpd.apache.org list-help: list-unsubscribe: List-Post: Delivered-To: mailing list dev@httpd.apache.org Received: (qmail 36297 invoked by uid 99); 5 Mar 2005 17:37:36 -0000 X-ASF-Spam-Status: No, hits=0.0 required=10.0 tests= X-Spam-Check-By: apache.org Received-SPF: neutral (hermes.apache.org: local policy) Received: from scotch.ics.uci.edu (HELO scotch.ics.uci.edu) (128.195.24.168) by apache.org (qpsmtpd/0.28) with ESMTP; Sat, 05 Mar 2005 09:37:34 -0800 Received: from localhost (scotch.ics.uci.edu [128.195.24.168]) (authenticated bits=0) by scotch.ics.uci.edu (8.12.11/8.12.11) with ESMTP id j25HbTLF028586 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 5 Mar 2005 09:37:30 -0800 (PST) Date: Sat, 05 Mar 2005 09:37:32 -0800 From: Justin Erenkrantz To: dev@httpd.apache.org Subject: Re: mod_cache Message-ID: <699D9AC07A22C294AEC9BB63@[10.0.1.44]> In-Reply-To: <4228E750.7050403@apache.org> References: <4228E750.7050403@apache.org> X-Mailer: Mulberry/3.1.5 (Mac OS X) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Spam-Status: No, score=-2.8 required=5.0 tests=ALL_TRUSTED autolearn=failed version=3.1.0-r105955 X-Spam-Checker-Version: SpamAssassin 3.1.0-r105955 (2004-11-20) on scotch.ics.uci.edu X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on scotch.ics.uci.edu X-Virus-Status: Clean X-Virus-Checked: Checked X-Spam-Rating: minotaur-2.apache.org 1.6.2 0/1000/N --On Friday, March 4, 2005 11:55 PM +0100 Sander Striker wrote: > -- > modules/cache/mod_cache.c:271 > > /* If the request has Cache-Control: no-store from RFC 2616, don't store > * unless CacheStoreNoStore is active. > */ > cc_in = apr_table_get(r->headers_in, "Cache-Control"); > if (r->no_cache || > (!conf->store_nostore && > ap_cache_liststr(NULL, cc_in, "no-store", NULL))) { > ap_remove_output_filter(f); > return ap_pass_brigade(f->next, in); > } > > What happens if the 'Cache-Control: no-store' header came in with a > 304 Not Modified and the original request wasn't conditional? > If I read the spec correctly a 304 can carry a Cache-Control header, > if it has a different value since a previous 200 (or 304). Hmm. Good point. What a goofy corner case though. I won't lose much sleep if we don't fix this. Care to add a FIXME? =) > -- > modules/cache/mod_cache.c:308 > > /* have we already run the cachability check and set up the > * cached file handle? */ > if (cache->in_checked) { > /* pass the brigades into the cache, then pass them > * up the filter stack > */ > > I haven't tracked cache->in_checked fully, so I wonder if it is > possible that this is set on a validating request? That would > cause the cache not being updated, which is what I am trying to > track down FWIW. This is not 'my' bug though, since I am seeing > the following line in the log: > > [debug] mod_disk_cache.c(616): disk_cache: Stored headers for URL xxx > > However the cache files on disk don't change... I'm a bit puzzled why > not from looking at the code. Yes, it would be set. in_checked is set after we know that we want to save the response (i.e. all of the cacheability checks passed successfully). However, note that if we are 'blocking' the response (i.e. we revalidated the response succesfully with a 304), we set block_response which is checked right before in_checked. (I'll come back to your mod_disk_cache problem at the end.) > -- > modules/cache/mod_cache.c:371 > > /* > * what responses should we not cache? > * > * At this point we decide based on the response headers whether it > * is appropriate _NOT_ to cache the data from the server. There are > * a whole lot of conditions that prevent us from caching this data. > * They are tested here one by one to be clear and unambiguous. */ > if (r->status != HTTP_OK && r->status != HTTP_NON_AUTHORITATIVE > && r->status != HTTP_MULTIPLE_CHOICES > && r->status != HTTP_MOVED_PERMANENTLY > && r->status != HTTP_NOT_MODIFIED) { > /* RFC2616 13.4 we are allowed to cache 200, 203, 206, 300, 301 or > 410 > * We don't cache 206, because we don't (yet) cache partial > responses. > * We include 304 Not Modified here too as this is the origin server > * telling us to serve the cached copy. > */ > reason = apr_psprintf(p, "Response status %d", r->status); > } > AIUI, we can cache "302 Found" (HTTP_MOVED_TEMPORARILY) when it has an > Expires > or Cache-Control indicating that the request can be cached. Fair enough. Feel free to add it, if you like. > -- > modules/cache/mod_cache.c:685 > > /* Did we just update the cached headers on a revalidated response? > * > * If so, we can now decide what to serve to the client: > * - If the original request was conditional and is satisified, send 304. > * - Otherwise, send the cached body. > */ > if (rv == APR_SUCCESS && cache->stale_handle) { > apr_bucket_brigade *bb; > apr_bucket *bkt; > > bb = apr_brigade_create(r->pool, r->connection->bucket_alloc); > > /* Were we initially a conditional request? */ > if (ap_cache_request_is_conditional(cache->stale_headers)) { > /* FIXME: We must ensure that the request meets conditions. */ > > /* Set the status to be a 304. */ > r->status = HTTP_NOT_MODIFIED; > > Is this as simple as clearing r->headers_in, overwriting with > cache->stale_headers, > and the calling ap_meets_conditions()? Yes, I *believe* so. But, we should double-check that ap_meets_condition would do the 'right' thing. I'm not 100% sure here. Okay, back to your real bug: > possible that this is set on a validating request? That would > cause the cache not being updated, which is what I am trying to > track down FWIW. This is not 'my' bug though, since I am seeing > the following line in the log: > > [debug] mod_disk_cache.c(616): disk_cache: Stored headers for URL xxx > > However the cache files on disk don't change... I'm a bit puzzled why > not from looking at the code. Please correct me if I'm wrong. My understanding is that you have an expired cache entry which needs to be revalidated and the updated headers aren't updating properly. Now that I read the code I'm betting you are hitting that else block with 'XXX log message' in mod_disk_cache's store_headers. The sequence that I'm envisioning is: - We have a stale cached entity/handle. - We send an If-Modified-Since/etc request - We get the 304 Not modified on revalidation with an updated Expires header. - At mod_cache.c:533, we restore the cached handle. - Around mod_cache.c:658, we merge in the updated response headers - Then at mod_cache.c:683, we call the provider to store the headers. - Then, we hit store_headers in mod_disk_cache. - Because mod_disk_cache *had* a stale response, it would have loaded the stale headers into its handle. - Hence, the check at mod_disk_cache.c:532 will fail because hfd is 'valid.' - We never update the cached headers with the new Expires header Can you confirm this sequence? I'm thinking we shouldn't even check for the !hfd case anyway. That check has probably been there forever and was likely to be completely bogus from the start. What do you think? -- justin