httpd-dev mailing list archives

From Graham Leggett <minf...@sharp.fm>
Subject Apache caching public-proxy layer...
Date Mon, 09 Oct 2000 16:44:42 GMT
Hi all,

Here is a rough draft of a preliminary design of a caching layer for
Apache. It grew out of mod_proxy, and the need for mod_proxy to handle
HTTP/1.1, which in essence means handling conditional requests using
If-None-Match or If-Modified-Since.

Some highlights of this design are:

- It's based on filters.
- It caches everything inside Apache, not just the proxy.
- It is based on HTTP/1.1 proxy support described in RFC2616.

The caching design is well on its way; the storage design needs a whole
lot of work, but the concepts are all there.

So - is this design workable? Flame away...

Regards,
Graham
--
====>

mod_cache - The HTTP/1.1 caching engine
=======================================

This document lays out a possible design for a caching layer for the
Apache webserver.

The basic idea is to add an HTTP/1.1 public cache layer on top of the
Apache webserver, with the purpose of serving cached data for eligible
requests, with minimal use of system resources and disk access. The idea
is basically mod_mmap_static on speed.

RFC 2616 (HTTP/1.1) already describes how a public cache should behave.
Instead of reinventing the wheel and creating a home-cooked cache
mechanism, we use the existing public cache mechanism built into the
protocol to form the basic working model of the cache.

There are two basic functions of a caching system:

- Deciding whether or not a response should be served from a cache (the
caching engine).
- Functions to handle the storing of cache data (the storage engine).


The Caching Engine
==================

The caching engine is made up of three distinct parts, each with a
simple, well-defined function:

- The Cache Content Handler
- The Conditional Filter
- The Cache Filter

A very bad ASCII drawing of the layout might look like this:

      Original Request
             |
             v
  +-----------------------+    +-------------------------------+
  | Cache Content Handler |--->| Other Apache Content Handlers |
  +-----------------------+    +-------------------------------+
             |                                 |
             +-------->------+-------<---------+
                             |
                             v
                 +----------------------+
                 | Other Apache Filters |
                 |   compression, etc   |
                 +----------------------+
                             |
                             v
                  +--------------------+
                  | Conditional Filter |
                  +--------------------+
                             |
                             v
                      +--------------+
                      | Cache Filter |
                      +--------------+
                             |
                             v
                  +----------------------+
                  | Other Apache Filters |
                  |    chunking, etc     |
                  +----------------------+
                             |
                             v
                          Network


1. The Cache Content Handler
----------------------------

This part's job is to answer the question:

	"Must I serve this object from the cache?"

This question has three possible answers: YES, NO and MAYBE.

NO:

The answer will be NO if:

- The object isn't in the cache in the first place (duh).
- The headers on the request from the client stipulate that cached
content is not appropriate in a response (e.g. Cache-Control: no-cache).

What the handler will do during a NO response:

- Nothing, it will DECLINE the request and let the real content handler
deal with it.
- It might in addition delete the object from the cache, should the
object exist in the cache and a Cache-Control: no-store header is
present in the request.
- It will set the CACHE flag to NOT_IN_CACHE to tell the downstream
Cache Filter that it is allowed to cache a response to this request.

YES:

The answer will be YES if:

- The object is in the cache.
- The content negotiation matches the object in the cache correctly.
- The object in the cache is FRESH, based on the HTTP/1.1 freshness
algorithm and the Cache-Control: max-age and other headers in the
request.

What the handler will do during a YES response:

- It will return the cached object and respond with a 200 OK, OR it
will return no body and a 304 Not Modified, depending on whether the
original request was conditional or not and whether the conditional
turned up a match or not.
- It will stop the content handler processing by responding with an OK
to Apache.
- It will set the CACHE flag to IN_CACHE to tell the downstream Cache
Filter to ignore this request - it was served from the cache anyway.
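
To make the FRESH test above a little more concrete, here is a minimal
Python sketch. It is a deliberate simplification of the RFC 2616
freshness algorithm: it only looks at max-age, and ignores Expires,
heuristic expiry and age correction. The function and parameter names
are made up for illustration and are not part of this design.

```python
def is_fresh(age_seconds, response_max_age, request_max_age=None):
    """Return True if a cached object may be served without revalidation.

    age_seconds      -- how old the cached object is (hypothetical input)
    response_max_age -- max-age from the stored response's Cache-Control
    request_max_age  -- max-age from the client's Cache-Control, if any
    """
    lifetime = response_max_age
    # A max-age on the request can only tighten the limit, never loosen it.
    if request_max_age is not None:
        lifetime = min(lifetime, request_max_age)
    # Conservative comparison: an object exactly at the limit is treated
    # as stale, so a request with max-age=0 always forces revalidation.
    return age_seconds < lifetime
```

With this sketch, is_fresh(10, 60) is true, while is_fresh(10, 60, 5) is
false because the client asked for something no older than 5 seconds.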

MAYBE:

The answer will be MAYBE if:

- The object is in the cache, AND...
- The content negotiation matches the object in the cache correctly,
AND...
- The object in the cache is NOT FRESH, based on the HTTP/1.1 freshness
algorithm and the Cache-Control: max-age and other headers in the
request.

What the handler will do during a MAYBE response:

- The handler will add an If-Modified-Since and/or If-None-Match header
to the request, AND...
- The handler will return a DECLINED, passing it on to the real content
handler.
- It will set the CACHE flag to CHECK_CACHE to tell the downstream
Conditional Filter that it must decide from the content handler's
response whether to deliver the cached object or the real object from
the real content handler.

At this point the real content handler has been duped into handling a
conditional request instead of an absolute one. The resulting
conditional response will be parsed by the Conditional Filter later in
the chain. If content comes back from the handler, then the content is
passed directly to the browser. If a 304 Not Modified comes through,
then either the cached response or a further 304 Not Modified will be
returned to the browser, depending on whether the original request was
conditional or not.
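
The three answers above can be sketched roughly as follows. This is an
illustration in Python, not Apache code. The cache entry is assumed to
be a plain dict with "etag", "last_modified" and "body" keys, the
freshness result is assumed to be computed elsewhere and passed in as
`fresh`, and content negotiation and the no-store deletion are left
out. All names here are hypothetical.

```python
NOT_IN_CACHE, IN_CACHE, CHECK_CACHE = "NOT_IN_CACHE", "IN_CACHE", "CHECK_CACHE"
DECLINED = "DECLINED"  # stands in for Apache's DECLINED return value


def cache_content_handler(headers, entry, fresh):
    """Return (answer, cache_flag, status, body) for one request.

    headers -- dict of request headers; modified in the MAYBE case
    entry   -- cached object as a dict, or None if nothing is cached
    fresh   -- True if the cached object passed the freshness test
    """
    directives = [d.strip().lower()
                  for d in headers.get("Cache-Control", "").split(",")]

    # NO: nothing cached, or the client refuses cached content.
    if entry is None or "no-cache" in directives:
        return "NO", NOT_IN_CACHE, DECLINED, None

    etag = entry.get("etag")

    # YES: serve from the cache, as 304 if the client's validator matches.
    if fresh:
        if etag is not None and headers.get("If-None-Match") == etag:
            return "YES", IN_CACHE, 304, None
        return "YES", IN_CACHE, 200, entry["body"]

    # MAYBE: dupe the real handler into a conditional request.
    # A real implementation would first record whether the client's own
    # request was conditional, since these headers get overwritten here.
    if etag is not None:
        headers["If-None-Match"] = etag
    if entry.get("last_modified") is not None:
        headers["If-Modified-Since"] = entry["last_modified"]
    return "MAYBE", CHECK_CACHE, DECLINED, None
```

Returning the flag alongside the status keeps the sketch free of shared
state; in Apache the flag would presumably live somewhere on the
request so that the downstream filters can read it.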


2. The Conditional Filter
-------------------------

This part's job is to answer the question:

	"Should I jump in and handle a duped conditional request?"

This question has two possible answers: YES and NO.

YES:

The answer will be YES if:

- The CACHE flag was set earlier to CHECK_CACHE.

What the conditional filter will do in a YES response:

- If the response from the underlying content handlers is 304 Not
Modified, then the response to be returned will be the cached object (if
the original request was a normal request) or a 304 Not Modified (if the
original request was a conditional request). The CACHE flag will be
changed to IN_CACHE, indicating to the downstream cache filter that the
content was in the cache and that nothing need be done.
- If the response from the underlying content handlers is anything else,
then the response to be returned will be unmodified, ie it will behave
as a passthrough filter. The CACHE flag will be changed to NOT_IN_CACHE
to indicate to the downstream cache filter that the content was not in
the cache, and should be stored in the cache if possible. 

NO:

The answer will be NO if:

- The CACHE flag is not set to CHECK_CACHE.

What the conditional filter will do in a NO response:

- The filter will do nothing to the content, ie it will behave as a
passthrough filter.
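
As a rough sketch of the behaviour described above (Python, for
illustration only), the filter takes the flag and the upstream response
and returns a possibly different flag and response. The function
signature, and the idea of passing in `original_was_conditional` and
`cached_body` as arguments, are assumptions made for the example.

```python
def conditional_filter(flag, status, body, cached_body,
                       original_was_conditional):
    """Return (new_flag, status, body) to pass down the filter chain.

    flag   -- CACHE flag set by the cache content handler
    status -- status code produced by the real content handler
    body   -- body produced by the real content handler (may be None)
    """
    # NO: not a duped conditional request, so act as a passthrough.
    if flag != "CHECK_CACHE":
        return flag, status, body

    # YES, and the real handler says the cached copy is still valid.
    if status == 304:
        if original_was_conditional:
            return "IN_CACHE", 304, None
        return "IN_CACHE", 200, cached_body

    # YES, but the real handler produced new content: pass it through
    # and let the cache filter store it if it is allowed to.
    return "NOT_IN_CACHE", status, body
```

Note that after this filter has run, the flag is never CHECK_CACHE any
more, so the cache filter only has to deal with the other two values.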


3. The Cache Filter
-------------------

This part's job is to answer the question:

	"Should I put the response into the cache?"

The question has two possible answers, YES and NO.

YES:

The answer will be YES if:

- The CACHE flag is set to NOT_IN_CACHE, indicating that the object is
not in the cache, and consequently should be stored, AND...
- The original request was not conditional OR the original request was
conditional and the response was not 304 Not Modified, AND...
- The headers on the request and response indicate that the object is
allowed to be cached (the Cache-Control and Pragma headers).

NO:

The answer will be NO if:

- The CACHE flag is set to IN_CACHE, indicating that the object is
already in the cache and has been served already, OR...
- The original request was conditional, and the resulting response was
304 Not Modified.
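
The decision above might be sketched like this in Python. The header
rule used here is an assumption and is deliberately conservative: it
refuses to store on no-store in the request, and on no-store, private
or no-cache (or Pragma: no-cache) in the response. The real rules in
RFC 2616 are more detailed. The function and helper names are
hypothetical.

```python
def _directives(headers):
    """Return the set of Cache-Control directives in a header dict."""
    value = headers.get("Cache-Control", "")
    return {d.strip().lower() for d in value.split(",") if d.strip()}


def should_store(flag, original_was_conditional, status,
                 request_headers, response_headers):
    """Return True if the cache filter should store this response."""
    # Anything other than NOT_IN_CACHE means there is nothing to store.
    if flag != "NOT_IN_CACHE":
        return False
    # A 304 answer to a conditional request carries no object to store.
    if original_was_conditional and status == 304:
        return False
    # Assumed, conservative reading of the Cache-Control/Pragma headers.
    if "no-store" in _directives(request_headers):
        return False
    if _directives(response_headers) & {"no-store", "private", "no-cache"}:
        return False
    if response_headers.get("Pragma", "").strip().lower() == "no-cache":
        return False
    return True
```

A request carrying Cache-Control: no-cache does not block storage here,
which matches the handler's NO case: the cached copy is bypassed, but
the fresh response may still be cached.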


The Storage Engine
==================

The storage engine is responsible for keeping cached objects available
in either shared memory or disk, and managing the cached objects.

Apart from the obvious functions of object storage, the storage engine
has content negotiation capabilities so that the cache can store
multiple instances of the same object (different representations,
different languages, etc).

It provides the following functions:

- Saving an object representation to memory or disk
- Querying the existence of a cached object representation based on both
URL and request headers (content negotiation)
- Deleting cached objects
- Updating cached object representations
- Reading an object representation from memory or disk

In accordance with the needs of RFC2616 HTTP/1.1, the following
capabilities must be possible:

- The cached object (a binary blob) should be writable to the cache.
- The original request headers should be written along with the cache
object.
- The response headers should be written along with the cache object.
- The response headers of an object in the cache should be updatable, as
required by RFC2616.
- It must be possible to specify whether a cached object should be
cached in memory, disk, or either.
- Multiple negotiated instances of the same object should be separately
cacheable. The decision of which object to return by the storage engine
will be based on content negotiation as described in RFC2616.
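
To give a feel for the interface, here is a minimal in-memory sketch of
such a storage engine in Python. It is only an illustration under a
number of assumptions: variants are selected by comparing the request
headers named in the response's Vary header, header names are matched
case-sensitively, and the memory/disk choice is ignored entirely. The
class and method names are hypothetical.

```python
class MemoryStore:
    """Toy storage engine keeping several variants per URL in memory."""

    def __init__(self):
        self._variants = {}  # url -> list of entry dicts

    @staticmethod
    def _matches(entry, request_headers):
        # A variant matches if every header named in its Vary list has
        # the same value now as in the request that created the variant.
        return all(entry["request_headers"].get(name) ==
                   request_headers.get(name)
                   for name in entry["vary"])

    def save(self, url, request_headers, response_headers, body):
        vary = [name.strip()
                for name in response_headers.get("Vary", "").split(",")
                if name.strip()]
        entry = {"request_headers": dict(request_headers),
                 "response_headers": dict(response_headers),
                 "body": body,
                 "vary": vary}
        variants = self._variants.setdefault(url, [])
        # Overwrite the variant this request would have selected, if any.
        for i, old in enumerate(variants):
            if self._matches(old, request_headers):
                variants[i] = entry
                return
        variants.append(entry)

    def lookup(self, url, request_headers):
        for entry in self._variants.get(url, []):
            if self._matches(entry, request_headers):
                return entry
        return None

    def update_headers(self, url, request_headers, new_headers):
        entry = self.lookup(url, request_headers)
        if entry is None:
            return False
        entry["response_headers"].update(new_headers)
        return True

    def delete(self, url):
        # Removes every variant stored under this URL.
        return self._variants.pop(url, None) is not None
```

Storing the original request headers next to each variant is what makes
the lookup possible: the engine can only tell which variant a new
request wants by comparing it with the request that produced each one.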
