httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Eric Prud'hommeaux" <e...@w3.org>
Subject Re: [PATCH] making mod_disk_cache work like mod_mem_cache
Date Fri, 02 Aug 2002 07:02:49 GMT
On Thu, Aug 01, 2002 at 07:55:32PM -0400, Paul J. Reder wrote:
> This only addresses part of the headers. You aren't handling last_
> modified header value, the content_type or the filename (which are
> all handled in the mem_cache).

I believe the special elements in the request structure are filled by
ap_scan_script_header_err_core's copious special header handing
(server/util_script.c:438).

  Content-type	-> r->content_type && set handler
  Status	-> r->status and r->status_line -- not relevent outside script
  Location	-> r->headers_out
  Content-Length-> r->headers_out
  Transfer-Encoding -> r->headers_out
  Last-Modified	-> r->mtime and r->headers_out
  Set-Cookie	-> overlayed on r->err_headers_out
  -others-	-> r->err_headers_out

Merging these seems to work out well (included in the attached diff):

    ap_scan_script_header_err(r, dobj->hfd, NULL);	/* read headers into err_headers_out
*/
    apr_table_overlap(r->headers_out, r->err_headers_out,
		      APR_OVERLAP_TABLES_MERGE);/* these are the real headers, not error headers */
    r->err_headers_out = apr_table_make(r->pool, 20);
    rv = apr_file_gets(&urlbuff[0], urllen, dobj->hfd);           /* Read status  */

Now ideally, none of the headers above will come in a GET request. We
can therefor call ap_scan_script_header_err and it will not corrupt/
contribute to the already set r->headers_out.

    rv = apr_file_gets(&urlbuff[0], urllen, dobj->hfd);   /* Skip empty line. */
    r->err_headers_out = h->req_hdrs = apr_table_make(r->pool, 20);/* Pick up */
    ap_scan_script_header_err(r, dobj->hfd, NULL);    /* request headers also */
    r->err_headers_out = apr_table_make(r->pool, 20);    /* in metadata file. */


If we are less confident of this, and elect to protect r->headers_out,
we can use a tempTable to hold the contents:

    apr_table_t * tempTable;
    ....
    rv = apr_file_gets(&urlbuff[0], urllen, dobj->hfd);   /* skip empty line  */
    tempTable = r->headers_out;
    r->headers_out = h->req_hdrs = apr_table_make(r->pool, 20);    /* pick up */
    ap_scan_script_header_err(r, dobj->hfd, NULL);    /* request headers also */
    r->headers_out = tempTable;				 /* in metadata file. */
    apr_table_overlap(h->req_hdrs, r->err_headers_out,
		      APR_OVERLAP_TABLES_MERGE);
    r->err_headers_out = apr_table_make(r->pool, 20);

I vote the former way. We are supposed to be dealing with compliant
agents. Downside, a bogus agent could, by sending a Content-Type: in a
GET request, override the response for everyone until the cache got
stale.

> You also cut and pasted the offset value in your code.
> 
> +    offset += (sizeof(info->expire)*2) + 1;
> +    info->request_time = ap_cache_hex2usec(urlbuff + offset);
> +    offset += (sizeof(info->expire)*2) + 1;
> +    info->response_time = ap_cache_hex2usec(urlbuff + offset);
> 
> should at a minimum be
> 
> +    offset += (sizeof(dobj->version)*2) + 1;
> +    info->request_time = ap_cache_hex2usec(urlbuff + offset);
> +    offset += (sizeof(info->request_time)*2) + 1;
> +    info->response_time = ap_cache_hex2usec(urlbuff + offset);
> 
> I know this is pedantic, but the values might change.

Yeah, I blindly copied that stuff. Thanks for noticing.

> I am in fact working on this very thing. I will add the extra code
> you have done to my patch if that is okay with you. Otherwise I can
> just do a more detailed look at your code with comments and commit
> your patch for you.
> 
> Let me know.

Seems like adopting it is the easiest for you. Either is fine by me.

> By the way, how tested is your code?

The patches I sent you represent a subset of the patches I've
introduced to handle HTTP Extensions (I'll submit them later). I
have been running that code for a week or so now, specifically
testing different scenarios.

As for the patches I sent you, I thought it was tested., but now I
think I was testing mem. Duh.

Anyways, current status:
Three code paths:

====== new, uncached resource -- established by `rm -r /usr/local/w3c-apache-clean/proxy/*`

GET http://mr-pink.w3.org:8081/robots.txt HTTP/1.1
Host: mr-pink.w3.org:8081
Foo: bar

HTTP/1.1 200 OK
Date: Fri, 02 Aug 2002 06:54:18 GMT
Server: Apache/1.3.26 (Unix) PHP/3.0.18
P3P: policyref="http://www.w3.org/2001/05/P3P/p3p.xml"
Cache-Control: max-age=21600
Expires: Fri, 02 Aug 2002 12:54:18 GMT
Last-Modified: Thu, 06 Jun 2002 09:44:28 GMT
ETag: "3cff2efc"
Accept-Ranges: bytes
Content-Type: text/plain; qs=0.4; charset=ISO-8859-1
Via: 1.1 mr-pink.w3.org:8081
Content-Length: 473

#
# robots.txt for http://www.w3.org/
...


===== served from valid cache

GET http://mr-pink.w3.org:8081/robots.txt HTTP/1.1
Host: mr-pink.w3.org:8081
Foo: bar

HTTP/1.1 200 OK
Date: Fri, 02 Aug 2002 06:55:53 GMT
Server: Apache/1.3.26 (Unix) PHP/3.0.18
Last-Modified: Thu, 06 Jun 2002 09:44:28 GMT
P3P: policyref="http://www.w3.org/2001/05/P3P/p3p.xml"
Cache-Control: max-age=21600
Expires: Fri, 02 Aug 2002 12:54:18 GMT
ETag: "3cff2efc"
Accept-Ranges: bytes
Via: 1.1 mr-pink.w3.org:8081
Content-Type: text/plain; qs=0.4; charset=ISO-8859-1
Age: 98403064
Content-Length: 473

#
# robots.txt for http://www.w3.org/


===== stale cache not used -- simulated with forcing
  ap_cache_check_freshness to return 0. I've also tested the date
  parsing stuff by setting up a 10 second expiry and seeing it "do the
  right thing".

GET http://mr-pink.w3.org:8081/robots.txt HTTP/1.1
Host: mr-pink.w3.org:8081
Foo: bar

HTTP/1.1 200 OK
Date: Fri, 02 Aug 2002 06:55:53 GMT
Server: Apache/1.3.26 (Unix) PHP/3.0.18
Last-Modified: Thu, 06 Jun 2002 09:44:28 GMT
P3P: policyref="http://www.w3.org/2001/05/P3P/p3p.xml"
Cache-Control: max-age=21600
Expires: Fri, 02 Aug 2002 12:54:18 GMT
ETag: "3cff2efc"
Accept-Ranges: bytes
Via: 1.1 mr-pink.w3.org:8081
Content-Type: text/plain; qs=0.4; charset=ISO-8859-1
Age: 98403064
Content-Length: 473

#
# robots.txt for http://www.w3.org/
...


===== end of samples

> Thanks,

Cheers,
-- 
-eric

> Eric Prud'hommeaux wrote:
> 
> > On Thu, Aug 01, 2002 at 08:52:49AM -0400, Bill Stoddard wrote:
> > 
> >>mod_mem_cache fomr HEAD should work. mod_disk_cache is still broken.
> >>
> >>Try a config like this:
> >>
> >>LoadModule cache_module modules/mod_cache.so
> >><IfModule mod_cache.c>
> >>   CacheOn On
> >>#  CacheMaxExpire 2
> >>#  CacheDefaultExpire 1
> >>   LoadModule mem_cache_module modules/mod_mem_cache.so
> >>   <IfModule mod_mem_cache.c>
> >>      CacheEnable mem /
> >>      MCacheSize 4096
> >>      MCacheMaxObjectCount 100
> >>      MCacheMinObjectSize 1
> >>      MCacheMaxObjectSize 1000000
> >>#      CacheDefaultExpire 1
> >>   </IfModule>
> >></IfModule>
> >>
> > 
> > Sweet -- that gave me a template to work with to make mod_disk_cache
> > work. I made disk_cache store the request headers so the Vary: parser
> > would have something to work with. I also through in the rest of the
> > times needed to calculate the expiration.
> > 
> > Also attached is the cache_storage.patch described earlier. Perhaps
> > someone up on mem-cache will be able to see if I should push this
> > functionality into mod_disk_cache.
> > 
> > 
> >>>-----Original Message-----
> >>>From: Eric Prud'hommeaux [mailto:eric@w3.org]
> >>>Sent: Thursday, August 01, 2002 1:06 AM
> >>>To: Apache HTTP server developers
> >>>Subject: apache 2 disk-cache SEGV
> >>>
> >>>
> >>>Has anybody seen --enable-disk-cache or --enable-mem-cache work under
> >>>apache 2? In my scenario:
> >>> --enable-maintainer-mode --with-mpm=prefork --enable-rewrite
> >>> --enable-expires --enable-speling --disable-auth --enable-headers
> >>> --enable-info --disable-userdir --enable-dav --enable-proxy
> >>> --enable-proxy-connect --enable-proxy-ftp --enable-proxy-http
> >>> --enable-file-cache --enable-cache --enable-disk-cache
> >>> --enable-mem-cache --no-create --no-recursion
> >>> --prefix=/usr/local/apache-2-clean
> >>>the cache context's filename is copied to r->filename
> >>>  cache_url_handler (modules/experimental/mod_cache.c:192)
> >>>  cache_read_entity_headers (modules/experimental/mod_cache.c:192)
> >>>  cache_select_url which (modules/experimental/mod_cache.c:287)
> >>>    r->filename = apr_pstrdup(r->pool, info->filename );
> >>>before the content_set filter sets it.
> >>>  cache_in_filter (modules/experimental/mod_cache.c:732)
> >>>    info->filename = apr_pstrdup(r->pool, r->filename );
> >>>
> >>>Can someone who has this working set breakpoints at cache_url_handler
> >>>and cache_in_filter and let me know if they see the same behavior?
> >>>The two are registered with
> >>>    ap_hook_quick_handler(cache_url_handler, NULL, NULL, APR_HOOK_FIRST);
> >>>and
> >>>    ap_register_output_filter("CACHE_IN",
> >>>                              cache_in_filter,
> >>>                              NULL,
> >>>                              AP_FTYPE_CONTENT_SET);
> >>>"CACHE_IN" is added to the request several places in cache_url_handler (a
> >>>bit too late) and in cache_conditional_filter, which is also added to the
> >>>request in cache_url_handler. From this it appears that the info->filename
> >>>will _always_ be copied to r->filename before it has been initialized.
> >>>
> >>>If you are replicating this problem, you will probably need to created
> >>>the disk cache directory /usr/local/apache-2-clean/proxy (in this case).
> >>>Otherwise, the cache is never written and so never read and so the code
> >>>path never arises.
> >>>
> >>>Since this is a cache re-use issue, you'll have to GET something twice
> >>>through the proxy in rapid enough succession that the cached copy
> >>>doesn't go stale.
> >>>
> >>>For now, I just cheesed around it with
> >>>+++ modules/experimental/mod_cache.c:287
> >>>+    if (info->filename)
> >>> 	r->filename = apr_pstrdup(r->pool, info->filename );
> >>>but never having seen it work, I don't know what the intended path is.
> >>>--
> >>>-eric
> >>>
> >>>(eric@w3.org)
> >>>Feel free to forward this message to any list for any purpose other than
> >>>email address distribution.
> >>>
> >>>
> > 
> > 
> > ------------------------------------------------------------------------
> > 
> > Index: httpd-2.0/modules/experimental/mod_disk_cache.c
> > ===================================================================
> > RCS file: /home/cvspublic/httpd-2.0/modules/experimental/mod_disk_cache.c,v
> > retrieving revision 1.36
> > diff -u -r1.36 mod_disk_cache.c
> > --- httpd-2.0/modules/experimental/mod_disk_cache.c	17 Jul 2002 14:52:36 -0000	1.36
> > +++ httpd-2.0/modules/experimental/mod_disk_cache.c	1 Aug 2002 21:50:11 -0000
> > @@ -237,7 +237,7 @@
> >      if ((temp = strchr(&urlbuff[0], '\n')) != NULL) /* trim off new line character
*/
> >          *temp = '\0';      /* overlay it with the null terminator */
> >  
> > -    if (!apr_date_checkmask(urlbuff, "&&&&&&&&&&&&&&&&
&&&&&&&&&&&&&&&& &&&&&&&&&&&&&&&&"))
{
> > +    if (!apr_date_checkmask(urlbuff, "&&&&&&&&&&&&&&&&
&&&&&&&&&&&&&&&& &&&&&&&&&&&&&&&&
&&&&&&&&&&&&&&&& &&&&&&&&&&&&&&&&"))
{
> >          return APR_EGENERAL;
> >      }
> >  
> > @@ -246,6 +246,10 @@
> >      info->expire = ap_cache_hex2usec(urlbuff + offset);
> >      offset += (sizeof(info->expire)*2) + 1;
> >      dobj->version = ap_cache_hex2usec(urlbuff + offset);
> > +    offset += (sizeof(info->expire)*2) + 1;
> > +    info->request_time = ap_cache_hex2usec(urlbuff + offset);
> > +    offset += (sizeof(info->expire)*2) + 1;
> > +    info->response_time = ap_cache_hex2usec(urlbuff + offset);
> >      
> >      /* check that we have the same URL */
> >      rv = apr_file_gets(&urlbuff[0], urllen, fd);
> > @@ -276,6 +280,8 @@
> >      char	dateHexS[sizeof(apr_time_t) * 2 + 1];
> >      char	expireHexS[sizeof(apr_time_t) * 2 + 1];
> >      char	verHexS[sizeof(apr_time_t) * 2 + 1];
> > +    char	requestHexS[sizeof(apr_time_t) * 2 + 1];
> > +    char	responseHexS[sizeof(apr_time_t) * 2 + 1];
> >      cache_info *info = &(h->cache_obj->info);
> >      disk_cache_object_t *dobj = (disk_cache_object_t *) h->cache_obj->vobj;
> >      
> > @@ -287,7 +293,9 @@
> >      ap_cache_usec2hex(info->date, dateHexS);
> >      ap_cache_usec2hex(info->expire, expireHexS);
> >      ap_cache_usec2hex(dobj->version++, verHexS);
> > -    buf = apr_pstrcat(r->pool, dateHexS, " ", expireHexS, " ", verHexS, "\n",
NULL);
> > +    ap_cache_usec2hex(info->request_time, requestHexS);
> > +    ap_cache_usec2hex(info->response_time, responseHexS);
> > +    buf = apr_pstrcat(r->pool, dateHexS, " ", expireHexS, " ", verHexS, " ",
requestHexS, " ", responseHexS, "\n", NULL);
> >      amt = strlen(buf);
> >      rc = apr_file_write(fd, buf, &amt);
> >      if (rc != APR_SUCCESS) {
> > @@ -448,6 +456,7 @@
> >      char urlbuff[1034];
> >      int urllen = sizeof(urlbuff);
> >      disk_cache_object_t *dobj = (disk_cache_object_t *) h->cache_obj->vobj;
> > +    apr_table_t * tmp;
> >  
> >      /* This case should not happen... */
> >      if (!dobj->fd || !dobj->hfd) {
> > @@ -462,7 +471,9 @@
> >      /*
> >       * Call routine to read the header lines/status line 
> >       */
> > -    ap_scan_script_header_err(r, dobj->hfd, NULL);
> > +    ap_scan_script_header_err(r, dobj->hfd, NULL);	/* read headers into err_headers_out
*/
> > +    r->headers_out = r->err_headers_out;	/* these are the real headers, not
error headers */
> > +    rv = apr_file_gets(&urlbuff[0], urllen, dobj->hfd);           /* Read
status  */
> >   
> >      apr_table_setn(r->headers_out, "Content-Type", 
> >                     ap_make_content_type(r, r->content_type));
> > @@ -486,6 +497,13 @@
> >  
> >      r->status_line = apr_pstrdup(r->pool, urlbuff);            /* Save status
line into request rec  */
> >  
> > +    /*
> > +     * Call routine to read the header lines/status line 
> > +     */
> > +    r->err_headers_out = h->req_hdrs;		/* pick up request headers also in
metadata file */
> > +    ap_scan_script_header_err(r, dobj->hfd, NULL);
> > +    r->err_headers_out = apr_table_make(r->pool, 20);
> > + 
> >      apr_file_close(dobj->hfd);
> >  
> >      ap_log_error(APLOG_MARK, APLOG_INFO, 0, r->server,
> > @@ -585,6 +603,23 @@
> >          buf = apr_pstrcat(r->pool, CRLF, NULL);
> >          amt = strlen(buf);
> >          apr_file_write(hfd, buf, &amt);
> > +
> > +	/* Dump the request headers so we can parse the Vary: header and check
> > +	   the entries in potential cachable request against this request. */
> > +        if (r->headers_in) {
> > +            int i;
> > +            apr_table_entry_t *elts = (apr_table_entry_t *) apr_table_elts(r->headers_in)->elts;
> > +            for (i = 0; i < apr_table_elts(r->headers_in)->nelts; ++i)
{
> > +                if (elts[i].key != NULL) {
> > +                    buf = apr_pstrcat(r->pool, elts[i].key, ": ",  elts[i].val,
CRLF, NULL);
> > +                    amt = strlen(buf);
> > +                    apr_file_write(hfd, buf, &amt);
> > +                }
> > +            }
> > +            buf = apr_pstrcat(r->pool, CRLF, NULL);
> > +            amt = strlen(buf);
> > +            apr_file_write(hfd, buf, &amt);
> > +        }
> >          apr_file_close(hfd); /* flush and close */
> >      }
> >      else {
> > 
> > 
> > ------------------------------------------------------------------------
> > 
> > Index: httpd-2.0/modules/experimental/cache_storage.c
> > ===================================================================
> > RCS file: /home/cvspublic/httpd-2.0/modules/experimental/cache_storage.c,v
> > retrieving revision 1.25
> > diff -u -r1.25 cache_storage.c
> > --- httpd-2.0/modules/experimental/cache_storage.c	23 Jun 2002 06:10:00 -0000	1.25
> > +++ httpd-2.0/modules/experimental/cache_storage.c	1 Aug 2002 21:50:10 -0000
> > @@ -283,7 +283,8 @@
> >          return rv;
> >      }
> >  
> > -    r->filename = apr_pstrdup(r->pool, info->filename );
> > +    if (info->filename)
> > +	r->filename = apr_pstrdup(r->pool, info->filename );
> >  
> >      return APR_SUCCESS;
> >  }
> > 
> > Part 1.1
> > 
> > Content-Type:
> > 
> > text/plain
> > 
> > 
> > ------------------------------------------------------------------------
> > mod_disk_cache.patch
> > 
> > Content-Type:
> > 
> > text/plain
> > 
> > 
> > ------------------------------------------------------------------------
> > cache_storage.patch
> > 
> > Content-Type:
> > 
> > text/plain
> > 
> > 
> 
> 
> -- 
> Paul J. Reder
> -----------------------------------------------------------
> "The strength of the Constitution lies entirely in the determination of each
> citizen to defend it.  Only if every single citizen feels duty bound to do
> his share in this defense are the constitutional rights secure."
> -- Albert Einstein
> 

-- 
-eric

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

Mime
View raw message