Author: jim Date: Mon Mar 19 13:15:36 2007 New Revision: 520078 URL: http://svn.apache.org/viewvc?view=rev&rev=520078 Log: Update the doccos, mainly to start folding in the o-f developer guideline. Added: httpd/httpd/trunk/docs/manual/developer/output-filters.html (with props) httpd/httpd/trunk/docs/manual/developer/output-filters.html.en Modified: httpd/httpd/trunk/docs/manual/mod/mod_disk_cache.xml.ja httpd/httpd/trunk/docs/manual/mod/mod_disk_cache.xml.ko httpd/httpd/trunk/docs/manual/mod/mod_disk_cache.xml.meta Added: httpd/httpd/trunk/docs/manual/developer/output-filters.html URL: http://svn.apache.org/viewvc/httpd/httpd/trunk/docs/manual/developer/output-filters.html?view=auto&rev=520078 ============================================================================== --- httpd/httpd/trunk/docs/manual/developer/output-filters.html (added) +++ httpd/httpd/trunk/docs/manual/developer/output-filters.html Mon Mar 19 13:15:36 2007 @@ -0,0 +1,3 @@ +URI: output-filters.html.en +Content-Language: en +Content-type: text/html; charset=ISO-8859-1 Propchange: httpd/httpd/trunk/docs/manual/developer/output-filters.html ------------------------------------------------------------------------------ svn:eol-style = native Added: httpd/httpd/trunk/docs/manual/developer/output-filters.html.en URL: http://svn.apache.org/viewvc/httpd/httpd/trunk/docs/manual/developer/output-filters.html.en?view=auto&rev=520078 ============================================================================== --- httpd/httpd/trunk/docs/manual/developer/output-filters.html.en (added) +++ httpd/httpd/trunk/docs/manual/developer/output-filters.html.en Mon Mar 19 13:15:36 2007 @@ -0,0 +1,473 @@ + + + +Guide to writing output filters - Apache HTTP Server + + + + + +

Guide to writing output filters

+
+

Available Languages:  en 

+
+ +

There are a number of common pitfalls encountered when writing output filters; this page aims to document best practice for authors of new or existing filters.

+ +

This document is applicable to both version 2.0 and version 2.2 of the Apache HTTP Server; it specifically targets RESOURCE-level or CONTENT_SET-level filters, though some advice is generic to all types of filter.

+
+
+
+
+

Filters and bucket brigades

+ + +

Each time a filter is invoked, it is passed a bucket brigade, containing a sequence of buckets which represent both data content and metadata. Every bucket has a bucket type; a number of bucket types are defined and used by the httpd core modules (and the apr-util library which provides the bucket brigade interface), but modules are free to define their own types.

+ +
Output filters must be prepared to process buckets of non-standard types; with a few exceptions, a filter need not care about the types of buckets being filtered.
+ +

A filter can tell whether a bucket represents data or metadata using the APR_BUCKET_IS_METADATA macro. Generally, all metadata buckets should be passed down the filter chain by an output filter. Filters may transform, delete, and insert data buckets as appropriate.

+ +

There are two metadata bucket types which all filters must pay attention to: the EOS bucket type, and the FLUSH bucket type. An EOS bucket indicates that the end of the response has been reached and no further buckets need be processed. A FLUSH bucket indicates that the filter should flush any buffered buckets (if applicable) down the filter chain immediately.

+ +
FLUSH buckets are sent when the content generator (or an upstream filter) knows that there may be a delay before more content can be sent. By passing FLUSH buckets down the filter chain immediately, filters ensure that the client is not kept waiting for pending data longer than necessary.
+ +

Filters can create FLUSH buckets and pass these down the filter chain if desired. Generating FLUSH buckets unnecessarily, or too frequently, can harm network utilisation since it may force large numbers of small packets to be sent, rather than a small number of larger packets. The section on Non-blocking bucket reads covers a case where filters are encouraged to generate FLUSH buckets.

+ +

Example bucket brigade

HEAP FLUSH FILE EOS
+ +

This shows a bucket brigade which may be passed to a filter; it contains two metadata buckets (FLUSH and EOS), and two data buckets (HEAP and FILE).
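As a rough illustration of how a filter might classify the buckets in the example brigade, here is a minimal sketch in plain C. It uses a toy enum rather than the real apr-util bucket types; every `toy_` name is hypothetical and exists only for this example. The helper mirrors the idea behind APR_BUCKET_IS_METADATA and stops at EOS, since anything after EOS is to be ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for bucket types; these are NOT the apr-util definitions. */
enum toy_type { TOY_HEAP, TOY_FILE, TOY_FLUSH, TOY_EOS };

/* Same idea as APR_BUCKET_IS_METADATA: FLUSH and EOS carry no content. */
static int toy_is_metadata(enum toy_type t)
{
    return t == TOY_FLUSH || t == TOY_EOS;
}

/* Walk the brigade up to and including EOS. Returns the number of data
 * buckets seen and stores the number of metadata buckets in *metadata.
 * Buckets following an EOS are not examined. */
static size_t toy_count_data(const enum toy_type *b, size_t n,
                             size_t *metadata)
{
    size_t data = 0;

    assert(b != NULL && metadata != NULL);
    *metadata = 0;
    for (size_t i = 0; i < n; i++) {
        if (toy_is_metadata(b[i])) {
            (*metadata)++;
            if (b[i] == TOY_EOS) {
                break;
            }
        }
        else {
            data++;
        }
    }
    return data;
}
```

Applied to the brigade HEAP FLUSH FILE EOS, this counts two data buckets and two metadata buckets, matching the description above.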

+ +
+
+

Filter invocation

+ + +

For any given request, an output filter might be invoked only once and be given a single brigade representing the entire response. It is also possible that the number of times a filter is invoked for a single response is proportional to the size of the content being filtered, with the filter being passed a brigade containing a single bucket each time. Filters must operate correctly in either case.

+ +
An output filter which allocates long-lived memory every time it is invoked may consume memory proportional to response size. Output filters which need to allocate memory should do so once per response; see Maintaining state below.
+ +

An output filter can distinguish the final invocation for a given response by the presence of an EOS bucket in the brigade. Any buckets in the brigade after an EOS should be ignored.

+ +

An output filter should never pass an empty brigade down the filter chain. As a matter of defensive programming, however, filters should be prepared to accept an empty brigade and do nothing.

+ +

How to handle an empty brigade

apr_status_t dummy_filter(ap_filter_t *f, apr_bucket_brigade *bb)
+{
+    if (APR_BRIGADE_EMPTY(bb)) {
+        return APR_SUCCESS;
+    }
+    ....
+ +
+
+

Brigade structure

+ + +

A bucket brigade is a doubly-linked list of buckets. The list is terminated (at both ends) by a sentinel which can be distinguished from a normal bucket by comparing it with the pointer returned by APR_BRIGADE_SENTINEL. The list sentinel is in fact not a valid bucket structure; any attempt to call normal bucket functions (such as apr_bucket_read) on the sentinel has undefined behaviour, and in practice will usually crash the process.

+ +

There are a variety of functions and macros for traversing and manipulating bucket brigades; see the apr_bucket.h header for complete coverage. Commonly used macros include:

APR_BRIGADE_FIRST(bb)
    returns the first bucket in brigade bb
APR_BRIGADE_LAST(bb)
    returns the last bucket in brigade bb
APR_BUCKET_NEXT(e)
    gives the next bucket after bucket e
APR_BUCKET_PREV(e)
    gives the bucket before bucket e
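The sentinel-terminated ring described in this section can be sketched with a toy doubly-linked list. This is a simplified model, not the APR implementation: the `toy_` names are hypothetical, and unlike the real brigade sentinel, the sentinel here is an ordinary struct so that the example stays self-contained. The traversal loop has the same shape as one written with APR_BRIGADE_FIRST, APR_BUCKET_NEXT and APR_BRIGADE_SENTINEL.

```c
#include <assert.h>
#include <stddef.h>

/* Toy bucket: only the ring links and a length. */
struct toy_bucket {
    struct toy_bucket *next;
    struct toy_bucket *prev;
    size_t length;
};

/* Toy brigade: an empty ring is a sentinel that points at itself. */
struct toy_brigade {
    struct toy_bucket sentinel;
};

static void toy_brigade_init(struct toy_brigade *bb)
{
    bb->sentinel.next = &bb->sentinel;
    bb->sentinel.prev = &bb->sentinel;
    bb->sentinel.length = 0;
}

/* Append bucket e just before the sentinel, i.e. at the tail. */
static void toy_insert_tail(struct toy_brigade *bb, struct toy_bucket *e)
{
    assert(bb != NULL && e != NULL);
    e->prev = bb->sentinel.prev;
    e->next = &bb->sentinel;
    bb->sentinel.prev->next = e;
    bb->sentinel.prev = e;
}

/* Walk from the first bucket until the sentinel is reached again. */
static size_t toy_total_length(struct toy_brigade *bb)
{
    size_t total = 0;

    for (struct toy_bucket *e = bb->sentinel.next;
         e != &bb->sentinel;
         e = e->next) {
        total += e->length;
    }
    return total;
}
```

Because the loop condition compares against the sentinel rather than NULL, an empty brigade is handled without a special case: the first bucket is the sentinel, so the loop body never runs.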

+ +

The apr_bucket_brigade structure itself is allocated out of a pool, so if a filter creates a new brigade, it must ensure that memory use is correctly bounded. A filter which allocates a new brigade out of the request pool (r->pool) on every invocation, for example, will fall foul of the warning above concerning memory use. Such a filter should instead create a brigade on the first invocation per request, and store that brigade in its state structure.

+ +

It is rarely advisable to use apr_brigade_destroy to "destroy" a brigade unless you know for certain that the brigade will never be used again; even then, it should be used sparingly. The memory used by the brigade structure will not be released by calling this function (since it comes from a pool), but the associated pool cleanup is unregistered. Using apr_brigade_destroy can in fact cause memory leaks; if a "destroyed" brigade contains buckets when its containing pool is destroyed, those buckets will not be immediately destroyed.

+ +

In general, filters should use apr_brigade_cleanup in preference to apr_brigade_destroy.

+ +
+
+

Processing buckets

+ + + +

When dealing with non-metadata buckets, it is important to understand that the "apr_bucket *" object is an abstract representation of data:

  1. The amount of data represented by the bucket may or may not have a determinate length; for a bucket which represents data of indeterminate length, the ->length field is set to the value (apr_size_t)-1. For example, buckets of the PIPE bucket type have an indeterminate length; they represent the output from a pipe.
  2. The data represented by a bucket may or may not be mapped into memory. The FILE bucket type, for example, represents data stored in a file on disk.

Filters read the data from a bucket using the apr_bucket_read function. When this function is invoked, the bucket may morph into a different bucket type, and may also insert a new bucket into the bucket brigade. This must happen for buckets which represent data not mapped into memory.

+ +

As an example, consider a bucket brigade containing a single FILE bucket representing an entire file, 24 kilobytes in size:

+ +
FILE(0K-24K)
+ +

When this bucket is read, it will read a block of data from the file, morph into a HEAP bucket to represent that data, and return the data to the caller. It also inserts a new FILE bucket representing the remainder of the file; after the apr_bucket_read call, the brigade looks like:

+ +
HEAP(8K) FILE(8K-24K)
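The morphing step can be modelled with a small toy simulation. This is only a sketch of the behaviour described above, not how apr-util implements FILE buckets: the `toy_` names are hypothetical, the brigade is a plain array, and the 8K block size is an assumption taken from the example.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BLOCK_SIZE 8192  /* assumed block size, as in the 8K example */

enum toy_kind { TOY_FILE, TOY_HEAP };

struct toy_bucket {
    enum toy_kind kind;
    size_t start;    /* offset of the data within the file */
    size_t length;   /* number of bytes represented */
};

/* "Read" bucket i of an array-based toy brigade holding n buckets.
 * A FILE bucket morphs into a HEAP bucket covering at most one block;
 * if data remains, a new FILE bucket for the remainder is inserted
 * directly after it. Returns the new bucket count. The array must have
 * room for one more element. */
static size_t toy_read(struct toy_bucket *bb, size_t n, size_t i)
{
    assert(bb != NULL && i < n);
    if (bb[i].kind != TOY_FILE) {
        return n;                       /* already in memory */
    }
    bb[i].kind = TOY_HEAP;
    if (bb[i].length <= TOY_BLOCK_SIZE) {
        return n;                       /* whole bucket fits in one block */
    }
    /* Shift later buckets up by one to make room for the remainder. */
    for (size_t j = n; j > i + 1; j--) {
        bb[j] = bb[j - 1];
    }
    bb[i + 1].kind = TOY_FILE;
    bb[i + 1].start = bb[i].start + TOY_BLOCK_SIZE;
    bb[i + 1].length = bb[i].length - TOY_BLOCK_SIZE;
    bb[i].length = TOY_BLOCK_SIZE;
    return n + 1;
}
```

Reading the single FILE(0K-24K) bucket in this model leaves a HEAP bucket of 8192 bytes followed by a FILE bucket starting at offset 8192 with 16384 bytes remaining, which corresponds to HEAP(8K) FILE(8K-24K).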
+ +
+
+

Filtering brigades

+ + +

The basic function of any output filter will be to iterate through the passed-in brigade and transform (or simply examine) the content in some manner. The implementation of the iteration loop is critical to producing a well-behaved output filter.

+ +

Consider an example which loops through the entire brigade as follows:

Bad output filter -- do not imitate!

apr_bucket *e = APR_BRIGADE_FIRST(bb);
+const char *data;
+apr_size_t length;
+
+while (e != APR_BRIGADE_SENTINEL(bb)) {
+   apr_bucket_read(e, &data, &length, APR_BLOCK_READ);
+   e = APR_BUCKET_NEXT(e);
+}
+
+return ap_pass_brigade(f->next, bb);
The above implementation would consume memory proportional to content size. If passed a FILE bucket, for example, the entire file contents would be read into memory as each apr_bucket_read call morphed a FILE bucket into a HEAP bucket.

+ +

In contrast, the implementation below will consume a fixed amount of memory to filter any brigade. A temporary brigade is needed, and must be allocated only once per response; see the Maintaining state section.

+ +

Better output filter

apr_bucket *e;
+const char *data;
+apr_size_t length;
+apr_status_t rv;
+
+while ((e = APR_BRIGADE_FIRST(bb)) != APR_BRIGADE_SENTINEL(bb)) {
+   rv = apr_bucket_read(e, &data, &length, APR_BLOCK_READ);
+   if (rv) ...;
+   /* Remove bucket e from bb. */
+   APR_BUCKET_REMOVE(e);
+   /* Insert it into the temporary brigade. */
+   APR_BRIGADE_INSERT_HEAD(tmpbb, e);
+   /* Pass brigade downstream. */
+   rv = ap_pass_brigade(f->next, tmpbb);
+   if (rv) ...;
+   apr_brigade_cleanup(tmpbb);
+}
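To make the memory argument concrete, the two loops can be compared with a small accounting model. This is a sketch under simplifying assumptions, not a measurement of httpd: it assumes each read brings one block into memory, that the first loop keeps every block until the end, and that the second loop releases each block once it has been passed downstream and the temporary brigade has been cleaned up. The `toy_` function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Peak bytes resident when every block is read into memory before
 * anything is passed downstream (the shape of the first loop). */
static size_t toy_peak_read_all(size_t file_len, size_t block)
{
    size_t resident = 0;
    size_t peak = 0;

    assert(block > 0);
    while (file_len > 0) {
        size_t chunk = file_len < block ? file_len : block;
        resident += chunk;              /* block stays in the brigade */
        if (resident > peak) {
            peak = resident;
        }
        file_len -= chunk;
    }
    return peak;
}

/* Peak bytes resident when each block is passed downstream and cleaned
 * up before the next one is read (the shape of the second loop). */
static size_t toy_peak_streaming(size_t file_len, size_t block)
{
    size_t peak = 0;

    assert(block > 0);
    while (file_len > 0) {
        size_t chunk = file_len < block ? file_len : block;
        if (chunk > peak) {
            peak = chunk;               /* only this block is resident */
        }
        file_len -= chunk;              /* passed on, then cleaned up */
    }
    return peak;
}
```

For the 24K file from the earlier example with 8K blocks, the first model peaks at 24576 bytes and the second at 8192 bytes; the second figure stays the same however large the file becomes.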
+ +
+
+

Maintaining state

+ + + +

A filter which needs to maintain state over multiple invocations per response can use the ->ctx field of its ap_filter_t structure. It is typical to store a temporary brigade in such a structure, to avoid having to allocate a new brigade per invocation as described in the Brigade structure section.

+ +

Example code to maintain filter state

struct dummy_state {
+   apr_bucket_brigade *tmpbb;
+   int filter_state;
+   ....
+};
+
+apr_status_t dummy_filter(ap_filter_t *f, apr_bucket_brigade *bb)
+{
+    struct dummy_state *state;
+
+    state = f->ctx;
+    if (state == NULL) {
+       /* First invocation for this response: initialise state structure. */
+       f->ctx = state = apr_palloc(f->r->pool, sizeof *state);
+       
+       state->tmpbb = apr_brigade_create(f->r->pool, f->c->bucket_alloc);
+       state->filter_state = ...;
+    }
+    ...
+ +
+
+

Buffering buckets

+ + +

If a filter decides to store buckets beyond the duration of a single filter function invocation (for example storing them in its ->ctx state structure), those buckets must be set aside. This is necessary because some bucket types provide buckets which represent temporary resources (such as stack memory) which will fall out of scope as soon as the filter chain completes processing the brigade.

+ +

To set aside a bucket, the apr_bucket_setaside function can be called. Not all bucket types can be set aside, but if the call succeeds, the bucket will have morphed to ensure it has a lifetime at least as long as the pool given as an argument to the apr_bucket_setaside function.

+ +

Alternatively, the ap_save_brigade function can be used, which will move all the buckets into a separate brigade containing buckets with a lifetime as long as the given pool argument. This function must be used with care, taking into account the following points:

  1. On return, ap_save_brigade guarantees that all the buckets in the returned brigade will represent data mapped into memory. If given an input brigade containing, for example, a PIPE bucket, ap_save_brigade will consume an arbitrary amount of memory to store the entire output of the pipe.
  2. When ap_save_brigade reads from buckets which cannot be set aside, it will always perform blocking reads, removing the opportunity to use Non-blocking bucket reads.
  3. If ap_save_brigade is used without passing a non-NULL "saveto" (destination) brigade parameter, the function will create a new brigade, which may cause memory use to be proportional to content size as described in the Brigade structure section.

+ +
Filters must ensure that any buffered data is processed and passed down the filter chain during the last invocation for a given response (a brigade containing an EOS bucket). Otherwise such data will be lost.
+ +
+
+

Non-blocking bucket reads

+ + +

The apr_bucket_read function takes an apr_read_type_e argument which determines whether a blocking or non-blocking read will be performed from the data source. A good filter will first attempt to read from every data bucket using a non-blocking read; if that fails with APR_EAGAIN, it should send a FLUSH bucket down the filter chain and retry using a blocking read.

+ +

This mode of operation ensures that any filters further down the filter chain will flush any buffered buckets if a slow content source is being used.

+ +

A CGI script is an example of a slow content source which is implemented as a bucket type. mod_cgi will send PIPE buckets which represent the output from a CGI script; reading from such a bucket will block when waiting for the CGI script to produce more output.

+ +

Example code using non-blocking bucket reads

apr_bucket *e;
+const char *data;
+apr_size_t length;
+apr_read_type_e mode = APR_NONBLOCK_READ;
+
+while ((e = APR_BRIGADE_FIRST(bb)) != APR_BRIGADE_SENTINEL(bb)) {
+    apr_status_t rv;
+
+    rv = apr_bucket_read(e, &data, &length, mode);
+    if (rv == APR_EAGAIN && mode == APR_NONBLOCK_READ) {
+        /* Pass down a brigade containing a flush bucket: */
+        APR_BRIGADE_INSERT_TAIL(tmpbb, apr_bucket_flush_create(...));
+        rv = ap_pass_brigade(f->next, tmpbb);
+        apr_brigade_cleanup(tmpbb);
+        if (rv != APR_SUCCESS) return rv;
+
+        /* Retry, using a blocking read. */
+        mode = APR_BLOCK_READ;
+        continue;
+    } else if (rv != APR_SUCCESS) { 
+        /* handle errors */
+    }
+
+    /* Next time, try a non-blocking read first. */
+    mode = APR_NONBLOCK_READ;
+    ...
+}
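The control flow of this retry loop can be exercised in isolation with a toy data source. This is a sketch only: the `toy_` types are hypothetical stand-ins, a "flush" is modelled as a simple counter rather than a FLUSH bucket passed to the next filter, and a blocking read is assumed always to succeed eventually.

```c
#include <assert.h>
#include <stddef.h>

enum toy_status { TOY_SUCCESS, TOY_EAGAIN };
enum toy_mode { TOY_NONBLOCK_READ, TOY_BLOCK_READ };

/* Toy source: ready[i] is non-zero if chunk i is available at once. */
struct toy_source {
    const int *ready;
    size_t count;
    size_t pos;
};

/* A non-blocking read fails with TOY_EAGAIN when the next chunk is not
 * yet available; a blocking read is modelled as waiting and succeeding. */
static enum toy_status toy_read(struct toy_source *s, enum toy_mode mode)
{
    assert(s->pos < s->count);
    if (mode == TOY_NONBLOCK_READ && !s->ready[s->pos]) {
        return TOY_EAGAIN;
    }
    s->pos++;
    return TOY_SUCCESS;
}

/* Drain the source using the non-blocking-first strategy. Returns the
 * number of flushes that would have been sent downstream. */
static size_t toy_drain(struct toy_source *s)
{
    size_t flushes = 0;
    enum toy_mode mode = TOY_NONBLOCK_READ;

    while (s->pos < s->count) {
        if (toy_read(s, mode) == TOY_EAGAIN) {
            flushes++;                  /* flush before we have to wait */
            mode = TOY_BLOCK_READ;      /* retry, this time blocking */
            continue;
        }
        mode = TOY_NONBLOCK_READ;       /* next chunk: non-blocking first */
    }
    return flushes;
}
```

In this model a flush is issued once per stall rather than once per chunk, which reflects the earlier advice that FLUSH buckets should not be generated more often than necessary.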
+ +
+
+

Ten rules for output filters

+ + +

In summary, here is a set of rules for all output filters to follow:

+ +
  1. Output filters should not pass empty brigades down the filter chain, but should be tolerant of being passed empty brigades.
  2. Output filters must pass all metadata buckets down the filter chain; FLUSH buckets should be respected by passing any pending or buffered buckets down the filter chain.
  3. Output filters should ignore any buckets following an EOS bucket.
  4. Output filters must process a fixed amount of data at a time, to ensure that memory consumption is not proportional to the size of the content being filtered.
  5. Output filters should be agnostic with respect to bucket types, and must be able to process buckets of unfamiliar type.
  6. After calling ap_pass_brigade to pass a brigade down the filter chain, output filters should call apr_brigade_cleanup to ensure the brigade is empty before reusing that brigade structure; output filters should never use apr_brigade_destroy to "destroy" brigades.
  7. Output filters must set aside any buckets which are preserved beyond the duration of the filter function.
  8. Output filters must not ignore the return value of ap_pass_brigade, and must return appropriate errors back up the filter chain.
  9. Output filters must only create a fixed number of bucket brigades for each response, rather than one per invocation.
  10. Output filters should first attempt non-blocking reads from each data bucket, and send a FLUSH bucket down the filter chain if the read blocks, before retrying with a blocking read.
+ +
+
+


+
+ \ No newline at end of file Modified: httpd/httpd/trunk/docs/manual/mod/mod_disk_cache.xml.ja URL: http://svn.apache.org/viewvc/httpd/httpd/trunk/docs/manual/mod/mod_disk_cache.xml.ja?view=diff&rev=520078&r1=520077&r2=520078 ============================================================================== --- httpd/httpd/trunk/docs/manual/mod/mod_disk_cache.xml.ja [iso-2022-jp] (original) +++ httpd/httpd/trunk/docs/manual/mod/mod_disk_cache.xml.ja [iso-2022-jp] Mon Mar 19 13:15:36 2007 @@ -1,7 +1,7 @@ - + +