httpd-cvs mailing list archives

From rbo...@apache.org
Subject cvs commit: httpd-2.0/docs/manual/misc custom_errordocs.html descriptors.html fin_wait_2.html footer.html header.html index.html known_client_problems.html perf-tuning.html rewriteguide.html security_tips.html tutorials.html
Date Sat, 22 Sep 2001 19:33:41 GMT
rbowen      01/09/22 12:33:41

  Modified:    docs/manual/misc custom_errordocs.html descriptors.html
                        fin_wait_2.html footer.html header.html index.html
                        known_client_problems.html perf-tuning.html
                        rewriteguide.html security_tips.html tutorials.html
  Log:
  w3c tidy to convert to xhtml
  
  Revision  Changes    Path
  1.9       +370 -314  httpd-2.0/docs/manual/misc/custom_errordocs.html
  
  Index: custom_errordocs.html
  ===================================================================
  RCS file: /home/cvs/httpd-2.0/docs/manual/misc/custom_errordocs.html,v
  retrieving revision 1.8
  retrieving revision 1.9
  diff -u -r1.8 -r1.9
  --- custom_errordocs.html	2001/02/10 23:33:36	1.8
  +++ custom_errordocs.html	2001/09/22 19:33:40	1.9
  @@ -1,144 +1,186 @@
  -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
  -<HTML>
  -<HEAD>
  -<TITLE>International Customized Server Error Messages</TITLE>
  -</HEAD>
  -
  -<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  -<BODY
  - BGCOLOR="#FFFFFF"
  - TEXT="#000000"
  - LINK="#0000FF"
  - VLINK="#000080"
  - ALINK="#FF0000"
  ->
  -<!--#include virtual="header.html" -->
  -
  -<H1 ALIGN="CENTER">Using XSSI and <SAMP>ErrorDocument</SAMP> to configure
  -customized international server error responses</H1>
  -<P>
  -<H2>Index</H2>
  -<UL>
  - <LI><A HREF="#intro">Introduction</A>
  - <LI><A HREF="#createdir">Creating an ErrorDocument directory</A>
  - <LI><A HREF="#docnames">Naming the individual error document files</A>
  - <LI><A HREF="#headfoot">The common header and footer files</A>
  - <LI><A HREF="#createdocs">Creating ErrorDocuments in different languages</A>
  - <LI><A HREF="#fallback">The fallback language</A>
  - <LI><A HREF="#proxy">Customizing Proxy Error Messages</A>
  - <LI><A HREF="#listings">HTML listing of the discussed example</A>
  -</UL>
  -<HR>
  -<H2><A NAME="intro">Introduction</A></H2>
  -This document describes an easy way to provide your apache WWW server
  -with a set of customized error messages which take advantage of
  -<A HREF="../content-negotiation.html">Content Negotiation</A>
  -and <A HREF="../mod/mod_include.html">eXtended Server Side Includes (XSSI)</A>
  -to return error messages generated by the server in the client's
  -native language.
  -</P>
  -<P>
  -By using XSSI, all
  -<A HREF="../mod/core.html#errordocument">customized messages</A>
  -can share a homogenous and consistent style and layout, and maintenance work
  -(changing images, changing links) is kept to a minimum because all layout
  -information can be kept in a single file.<BR>
  -Error documents can be shared across different servers, or even hosts,
  -because all varying information is inserted at the time the error document
  -is returned on behalf of a failed request.
  -</P>
  -<P>
  -Content Negotiation then selects the appropriate language version of a
  -particular error message text, honoring the language preferences passed
  -in the client's request. (Users usually select their favorite languages
  -in the preferences options menu of today's browsers). When an error
  -document in the client's primary language version is unavailable, the 
  -secondary languages are tried or a default (fallback) version is used.
  -</P>
  -<P>
  -You have full flexibility in designing your error documents to
  -your personal taste (or your company's conventions). For demonstration
  -purposes, we present a simple generic error document scheme.
  -For this hypothetic server, we assume that all error messages...
  -<UL>
  -<LI>possibly are served by different virtual hosts (different host name,
  -    different IP address, or different port) on the server machine,
  -<LI>show a predefined company logo in the right top of the message
  -    (selectable by virtual host),
  -<LI>print the error title first, followed by an explanatory text and
  -    (depending on the error context) help on how to resolve the error,
  -<LI>have some kind of standardized background image,
  -<LI>display an apache logo and a feedback email address at the bottom
  -    of the error message.
  -</UL>
  -</P>
  -
  -<P>
  -An example of a "document not found" message for a german client might
  -look like this:<BR>
  -<IMG SRC="../images/custom_errordocs.gif"
  - ALT="[Needs graphics capability to display]"><BR>
  -All links in the document as well as links to the server's administrator
  -mail address, and even the name and port of the serving virtual host
  -are inserted in the error document at "run-time", <EM>i.e.</EM>, when the error
  -actually occurs.
  -</P>
  -
  -<H2><A NAME="createdir">Creating an ErrorDocument directory</A></H2>
  -
  -For this concept to work as easily as possible, we must take advantage
  -of as much server support as we can get:
  -<OL>
  - <LI>By defining the <A HREF="../mod/core.html#options">MultiViews option</A>,
  -     we enable the language selection of the most appropriate language
  -     alternative (content negotiation). 
  - <LI>By setting the
  -     <A HREF="../mod/mod_negotiation.html#languagepriority"
  -     >LanguagePriority</A> 
  -     directive we define a set of default fallback languages in the situation
  -     where the client's browser did not express any preference at all.
  - <LI>By enabling <A HREF="../mod/mod_include.html">Server Side Includes</A>
  -     (and disallowing execution of cgi scripts for security reasons),
  -     we allow the server to include building blocks of the error message,
  -     and to substitute the value of certain environment variables into the
  -     generated document (dynamic HTML) or even to conditionally include
  -     or omit parts of the text.
  - <LI>The <A HREF="../mod/mod_mime.html#addhandler">AddHandler</A> and
  -     <A HREF="../mod/mod_mime.html#addtype">AddType</A> directives are useful
  -     for automatically XSSI-expanding all files with a <SAMP>.shtml</SAMP>
  -     suffix to <EM>text/html</EM>.
  - <LI>By using the <A HREF="../mod/mod_alias.html#alias">Alias</A> directive,
  -     we keep the error document directory outside of the document tree
  -     because it can be regarded more as a server part than part of
  -     the document tree.
  - <LI>The <A HREF="../mod/core.html#directory">&lt;Directory&gt;</A>-Block
  -     restricts these "special" settings to the error document directory
  -     and avoids an impact on any of the settings for the regular document tree.
  - <LI>For each of the error codes to be handled (see RFC2068 for an exact
  -     description of each error code, or look at
  -     <CODE>src/main/http_protocol.c</CODE>
  -     if you wish to see apache's standard messages), an
  -     <A HREF="../mod/core.html#errordocument">ErrorDocument</A>
  -     in the aliased <SAMP>/errordocs</SAMP> directory is defined.
  -     Note that we only define the basename of the document here
  -     because the MultiViews option will select the best candidate
  -     based on the language suffixes and the client's preferences.
  -     Any error situation with an error code <EM>not</EM> handled by a
  -     custom document will be dealt with by the server in the standard way
  -     (<EM>i.e.</EM>, a plain error message in english).
  - <LI>Finally, the <A HREF="../mod/core.html#allowoverride">AllowOverride</A>
  -     directive tells apache that it is not necessary to look for 
  -     a .htaccess file in the /errordocs directory: a minor speed 
  -     optimization.
  -</OL>
  -The resulting <SAMP>httpd.conf</SAMP> configuration would then look
  -similar to this: <SMALL>(Note that you can define your own error
  -messages using this method for only part of the document tree,
  -e.g., a /~user/ subtree. In this case, the configuration could as well
  -be put into the .htaccess file at the root of the subtree, and
  -the &lt;Directory&gt; and &lt;/Directory&gt; directives -but not
  -the contained directives- must be omitted.)</SMALL>
  -<PRE>
  +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  +    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  +
  +<html xmlns="http://www.w3.org/1999/xhtml">
  +  <head>
  +    <meta name="generator" content="HTML Tidy, see www.w3.org" />
  +
  +    <title>International Customized Server Error Messages</title>
  +  </head>
  +  <!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  +
  +  <body bgcolor="#FFFFFF" text="#000000" link="#0000FF"
  +  vlink="#000080" alink="#FF0000">
  +    <!--#include virtual="header.html" -->
  +
  +    <h1 align="CENTER">Using XSSI and <samp>ErrorDocument</samp> to
  +    configure customized international server error responses</h1>
  +
  +    <h2>Index</h2>
  +
  +    <ul>
  +      <li><a href="#intro">Introduction</a></li>
  +
  +      <li><a href="#createdir">Creating an ErrorDocument
  +      directory</a></li>
  +
  +      <li><a href="#docnames">Naming the individual error document
  +      files</a></li>
  +
  +      <li><a href="#headfoot">The common header and footer
  +      files</a></li>
  +
  +      <li><a href="#createdocs">Creating ErrorDocuments in
  +      different languages</a></li>
  +
  +      <li><a href="#fallback">The fallback language</a></li>
  +
  +      <li><a href="#proxy">Customizing Proxy Error
  +      Messages</a></li>
  +
  +      <li><a href="#listings">HTML listing of the discussed
  +      example</a></li>
  +    </ul>
  +    <hr />
  +
  +    <h2><a id="intro" name="intro">Introduction</a></h2>
  +    This document describes an easy way to provide your apache WWW
  +    server with a set of customized error messages which take
  +    advantage of <a href="../content-negotiation.html">Content
  +    Negotiation</a> and <a href="../mod/mod_include.html">eXtended
  +    Server Side Includes (XSSI)</a> to return error messages
  +    generated by the server in the client's native language. <br />
  +     <br />
  +     
  +
  +    <p>By using XSSI, all <a
  +    href="../mod/core.html#errordocument">customized messages</a>
  +    can share a homogenous and consistent style and layout, and
  +    maintenance work (changing images, changing links) is kept to a
  +    minimum because all layout information can be kept in a single
  +    file.<br />
  +     Error documents can be shared across different servers, or
  +    even hosts, because all varying information is inserted at the
  +    time the error document is returned on behalf of a failed
  +    request.</p>
  +
  +    <p>Content Negotiation then selects the appropriate language
  +    version of a particular error message text, honoring the
  +    language preferences passed in the client's request. (Users
  +    usually select their favorite languages in the preferences
  +    options menu of today's browsers). When an error document in
  +    the client's primary language version is unavailable, the
  +    secondary languages are tried or a default (fallback) version
  +    is used.</p>
  +
  +    <p>You have full flexibility in designing your error documents
  +    to your personal taste (or your company's conventions). For
  +    demonstration purposes, we present a simple generic error
  +    document scheme. For this hypothetic server, we assume that all
  +    error messages...</p>
  +
  +    <ul>
  +      <li>possibly are served by different virtual hosts (different
  +      host name, different IP address, or different port) on the
  +      server machine,</li>
  +
  +      <li>show a predefined company logo in the right top of the
  +      message (selectable by virtual host),</li>
  +
  +      <li>print the error title first, followed by an explanatory
  +      text and (depending on the error context) help on how to
  +      resolve the error,</li>
  +
  +      <li>have some kind of standardized background image,</li>
  +
  +      <li>display an apache logo and a feedback email address at
  +      the bottom of the error message.</li>
  +    </ul>
  +    <br />
  +     <br />
  +     
  +
  +    <p>An example of a "document not found" message for a german
  +    client might look like this:<br />
  +     <img src="../images/custom_errordocs.gif"
  +    alt="[Needs graphics capability to display]" /><br />
  +     All links in the document as well as links to the server's
  +    administrator mail address, and even the name and port of the
  +    serving virtual host are inserted in the error document at
  +    "run-time", <em>i.e.</em>, when the error actually occurs.</p>
  +
  +    <h2><a id="createdir" name="createdir">Creating an
  +    ErrorDocument directory</a></h2>
  +    For this concept to work as easily as possible, we must take
  +    advantage of as much server support as we can get: 
  +
  +    <ol>
  +      <li>By defining the <a
  +      href="../mod/core.html#options">MultiViews option</a>, we
  +      enable the language selection of the most appropriate
  +      language alternative (content negotiation).</li>
  +
  +      <li>By setting the <a
  +      href="../mod/mod_negotiation.html#languagepriority">LanguagePriority</a>
  +      directive we define a set of default fallback languages in
  +      the situation where the client's browser did not express any
  +      preference at all.</li>
  +
  +      <li>By enabling <a href="../mod/mod_include.html">Server Side
  +      Includes</a> (and disallowing execution of cgi scripts for
  +      security reasons), we allow the server to include building
  +      blocks of the error message, and to substitute the value of
  +      certain environment variables into the generated document
  +      (dynamic HTML) or even to conditionally include or omit parts
  +      of the text.</li>
  +
  +      <li>The <a
  +      href="../mod/mod_mime.html#addhandler">AddHandler</a> and <a
  +      href="../mod/mod_mime.html#addtype">AddType</a> directives
  +      are useful for automatically XSSI-expanding all files with a
  +      <samp>.shtml</samp> suffix to <em>text/html</em>.</li>
  +
  +      <li>By using the <a
  +      href="../mod/mod_alias.html#alias">Alias</a> directive, we
  +      keep the error document directory outside of the document
  +      tree because it can be regarded more as a server part than
  +      part of the document tree.</li>
  +
  +      <li>The <a
  +      href="../mod/core.html#directory">&lt;Directory&gt;</a>-Block
  +      restricts these "special" settings to the error document
  +      directory and avoids an impact on any of the settings for the
  +      regular document tree.</li>
  +
  +      <li>For each of the error codes to be handled (see RFC2068
  +      for an exact description of each error code, or look at
  +      <code>src/main/http_protocol.c</code> if you wish to see
  +      apache's standard messages), an <a
  +      href="../mod/core.html#errordocument">ErrorDocument</a> in
  +      the aliased <samp>/errordocs</samp> directory is defined.
  +      Note that we only define the basename of the document here
  +      because the MultiViews option will select the best candidate
  +      based on the language suffixes and the client's preferences.
  +      Any error situation with an error code <em>not</em> handled
  +      by a custom document will be dealt with by the server in the
  +      standard way (<em>i.e.</em>, a plain error message in
  +      english).</li>
  +
  +      <li>Finally, the <a
  +      href="../mod/core.html#allowoverride">AllowOverride</a>
  +      directive tells apache that it is not necessary to look for a
  +      .htaccess file in the /errordocs directory: a minor speed
  +      optimization.</li>
  +    </ol>
  +    The resulting <samp>httpd.conf</samp> configuration would then
  +    look similar to this: <small>(Note that you can define your own
  +    error messages using this method for only part of the document
  +    tree, e.g., a /~user/ subtree. In this case, the configuration
  +    could as well be put into the .htaccess file at the root of the
  +    subtree, and the &lt;Directory&gt; and &lt;/Directory&gt;
  +    directives -but not the contained directives- must be
  +    omitted.)</small> 
  +<pre>
     LanguagePriority en fr de 
     Alias  /errordocs  /usr/local/apache/errordocs
     &lt;Directory /usr/local/apache/errordocs&gt;
  @@ -159,152 +201,162 @@
     ErrorDocument  404  /errordocs/404
     #    "500 Internal Server Error",
     ErrorDocument  500  /errordocs/500
  -</PRE>
  -The directory for the error messages (here:
  -<SAMP>/usr/local/apache/errordocs/</SAMP>) must then be created with the
  -appropriate permissions (readable and executable by the server uid or gid, 
  -only writable for the administrator).
  -
  -<H3><A NAME="docnames">Naming the individual error document files</A></H3>
  -
  -By defining the <SAMP>MultiViews</SAMP> option, the server was told to
  -automatically scan the directory for matching variants (looking at language
  -and content type suffixes) when a requested document was not found.
  -In the configuration, we defined the names for the error documents to be
  -just their error number (without any suffix).
  -<P>
  -The names of the individual error documents are now determined like this
  -(I'm using 403 as an example, think of it as a placeholder for any of
  -the configured error documents):
  -<UL>
  - <LI>No file errordocs/403 should exist. Otherwise, it would be found and
  -     served (with the DefaultType, usually text/plain), all negotiation
  -     would be bypassed.
  - <LI>For each language for which we have an internationalized version
  -     (note that this need not be the same set of languages for each
  -     error code - you can get by with a single language version until
  -     you actually <EM>have</EM> translated versions), a document
  -     <SAMP>errordocs/403.shtml.<EM>lang</EM></SAMP> is created and
  -     filled with the error text in that language (<A HREF="#createdocs">see
  -     below</A>).
  - <LI>One fallback document called <SAMP>errordocs/403.shtml</SAMP> is
  -     created, usually by creating a symlink to the default language
  -     variant (<A HREF="#fallback">see below</A>).
  -</UL>
  -
  -<H3><A NAME="headfoot">The common header and footer files</A></H3>
  -
  -By putting as much layout information in two special "include files", 
  -the error documents can be reduced to a bare minimum.
  -<P>
  -One of these layout files defines the HTML document header
  -and a configurable list of paths to the icons to be shown in the resulting
  -error document. These paths are exported as a set of XSSI environment
  -variables and are later evaluated by the "footer" special file.
  -The title of the current error (which is
  -put into the TITLE tag and an H1 header) is simply passed in from the main
  -error document in a variable called <CODE>title</CODE>.<BR>
  -<STRONG>By changing this file, the layout of all generated error
  -messages can be changed in a second.</STRONG>
  -(By exploiting the features of XSSI, you can easily define different
  -layouts based on the current virtual host, or even based on the
  -client's domain name).
  -<P>
  -The second layout file describes the footer to be displayed at the bottom
  -of every error message. In this example, it shows an apache logo, the current
  -server time, the server version string and adds a mail reference to the
  -site's webmaster.
  -<P>
  -For simplicity, the header file is simply called <CODE>head.shtml</CODE>
  -because it contains server-parsed content but no language specific
  -information. The footer file exists once for each language translation,
  -plus a symlink for the default language.<P>
  -<STRONG>Example:</STRONG> for English, French and German versions
  -(default english)<BR>
  -<CODE>foot.shtml.en</CODE>,<BR>
  -<CODE>foot.shtml.fr</CODE>,<BR>
  -<CODE>foot.shtml.de</CODE>,<BR>
  -<CODE>foot.shtml</CODE> symlink to <CODE>foot.shtml.en</CODE><P>
  -Both files are included into the error document by using the
  -directives <CODE>&lt;!--#include virtual="head" --&gt;</CODE>
  -and        <CODE>&lt;!--#include virtual="foot" --&gt;</CODE>
  -respectively: the rest of the magic occurs in mod_negotiation and
  -in mod_include.
  -<P>
  -
  -See <A HREF="#listings">the listings below</A> to see an actual HTML
  -implementation of the discussed example.
  -
  -
  -<H3><A NAME="createdocs">Creating ErrorDocuments in different languages</A>
  -</H3>
  -
  -After all this preparation work, little remains to be said about the
  -actual documents. They all share a simple common structure:
  -<PRE>
  -&lt;!--#set var="title" value="<EM>error description title</EM>" --&gt;
  +</pre>
  +    The directory for the error messages (here:
  +    <samp>/usr/local/apache/errordocs/</samp>) must then be created
  +    with the appropriate permissions (readable and executable by
  +    the server uid or gid, only writable for the administrator). 
  +
  +    <h3><a id="docnames" name="docnames">Naming the individual
  +    error document files</a></h3>
  +    By defining the <samp>MultiViews</samp> option, the server was
  +    told to automatically scan the directory for matching variants
  +    (looking at language and content type suffixes) when a
  +    requested document was not found. In the configuration, we
  +    defined the names for the error documents to be just their
  +    error number (without any suffix). 
  +
  +    <p>The names of the individual error documents are now
  +    determined like this (I'm using 403 as an example, think of it
  +    as a placeholder for any of the configured error
  +    documents):</p>
  +
  +    <ul>
  +      <li>No file errordocs/403 should exist. Otherwise, it would
  +      be found and served (with the DefaultType, usually
  +      text/plain), all negotiation would be bypassed.</li>
  +
  +      <li>For each language for which we have an internationalized
  +      version (note that this need not be the same set of languages
  +      for each error code - you can get by with a single language
  +      version until you actually <em>have</em> translated
  +      versions), a document
  +      <samp>errordocs/403.shtml.<em>lang</em></samp> is created and
  +      filled with the error text in that language (<a
  +      href="#createdocs">see below</a>).</li>
  +
  +      <li>One fallback document called
  +      <samp>errordocs/403.shtml</samp> is created, usually by
  +      creating a symlink to the default language variant (<a
  +      href="#fallback">see below</a>).</li>
  +    </ul>
  +
  +    <h3><a id="headfoot" name="headfoot">The common header and
  +    footer files</a></h3>
  +    By putting as much layout information in two special "include
  +    files", the error documents can be reduced to a bare minimum. 
  +
  +    <p>One of these layout files defines the HTML document header
  +    and a configurable list of paths to the icons to be shown in
  +    the resulting error document. These paths are exported as a set
  +    of XSSI environment variables and are later evaluated by the
  +    "footer" special file. The title of the current error (which is
  +    put into the TITLE tag and an H1 header) is simply passed in
  +    from the main error document in a variable called
  +    <code>title</code>.<br />
  +     <strong>By changing this file, the layout of all generated
  +    error messages can be changed in a second.</strong> (By
  +    exploiting the features of XSSI, you can easily define
  +    different layouts based on the current virtual host, or even
  +    based on the client's domain name).</p>
  +
  +    <p>The second layout file describes the footer to be displayed
  +    at the bottom of every error message. In this example, it shows
  +    an apache logo, the current server time, the server version
  +    string and adds a mail reference to the site's webmaster.</p>
  +
  +    <p>For simplicity, the header file is simply called
  +    <code>head.shtml</code> because it contains server-parsed
  +    content but no language specific information. The footer file
  +    exists once for each language translation, plus a symlink for
  +    the default language.</p>
  +
  +    <p><strong>Example:</strong> for English, French and German
  +    versions (default english)<br />
  +     <code>foot.shtml.en</code>,<br />
  +     <code>foot.shtml.fr</code>,<br />
  +     <code>foot.shtml.de</code>,<br />
  +     <code>foot.shtml</code> symlink to
  +    <code>foot.shtml.en</code></p>
  +
  +    <p>Both files are included into the error document by using the
  +    directives <code>&lt;!--#include virtual="head" --&gt;</code>
  +    and <code>&lt;!--#include virtual="foot" --&gt;</code>
  +    respectively: the rest of the magic occurs in mod_negotiation
  +    and in mod_include.</p>
  +
  +    <p>See <a href="#listings">the listings below</a> to see an
  +    actual HTML implementation of the discussed example.</p>
  +
  +    <h3><a id="createdocs" name="createdocs">Creating
  +    ErrorDocuments in different languages</a></h3>
  +    After all this preparation work, little remains to be said
  +    about the actual documents. They all share a simple common
  +    structure: 
  +<pre>
  +&lt;!--#set var="title" value="<em>error description title</em>" --&gt;
   &lt;!--#include virtual="head" --&gt;
  -   <EM>explanatory error text</EM>
  +   <em>explanatory error text</em>
   &lt;!--#include virtual="foot" --&gt;
  -</PRE>
  -In the <A HREF="#listings">listings section</A>, you can see an example
  -of a [400 Bad Request] error document. Documents as simple as that
  -certainly cause no problems to translate or expand.
  -
  -<H3><A NAME="fallback">The fallback language</A></H3>
  -
  -Do we need a special handling for languages other than those we have
  -translations for? We did set the LanguagePriority, didn't we?!
  -<P>
  -Well, the LanguagePriority directive is for the case where the client does
  -not express any language priority at all. But what
  -happens in the situation where the client wants one
  -of the languages we do not have, and none of those we do have?
  -<P>
  -Without doing anything, the Apache server will usually return a
  -[406 no acceptable variant] error, listing the choices from which the client
  -may select. But we're in an error message already, and important error
  -information might get lost when the client had to choose a language
  -representation first.
  -<P>
  -So, in this situation it appears to be easier to define a fallback language
  -(by copying or linking, <EM>e.g.</EM>, the english version to a language-less version).
  -Because the negotiation algorithm prefers "more specialized" variants over
  -"more generic" variants, these generic alternatives will only be chosen
  -when the normal negotiation did not succeed.
  -<P> 
  -A simple shell script to do it (execute within the errordocs/ dir):
  -<PRE>
  +</pre>
  +    In the <a href="#listings">listings section</a>, you can see an
  +    example of a [400 Bad Request] error document. Documents as
  +    simple as that certainly cause no problems to translate or
  +    expand. 
  +
  +    <h3><a id="fallback" name="fallback">The fallback
  +    language</a></h3>
  +    Do we need a special handling for languages other than those we
  +    have translations for? We did set the LanguagePriority, didn't
  +    we?! 
  +
  +    <p>Well, the LanguagePriority directive is for the case where
  +    the client does not express any language priority at all. But
  +    what happens in the situation where the client wants one of the
  +    languages we do not have, and none of those we do have?</p>
  +
  +    <p>Without doing anything, the Apache server will usually
  +    return a [406 no acceptable variant] error, listing the choices
  +    from which the client may select. But we're in an error message
  +    already, and important error information might get lost when
  +    the client had to choose a language representation first.</p>
  +
  +    <p>So, in this situation it appears to be easier to define a
  +    fallback language (by copying or linking, <em>e.g.</em>, the
  +    english version to a language-less version). Because the
  +    negotiation algorithm prefers "more specialized" variants over
  +    "more generic" variants, these generic alternatives will only
  +    be chosen when the normal negotiation did not succeed.</p>
  +
  +    <p>A simple shell script to do it (execute within the
  +    errordocs/ dir):</p>
  +<pre>
     for f in *.shtml.en
     do
        ln -s $f `basename $f .en`
     done
  -</PRE>
  -
  -<P>
  -</P>
  +</pre>
   
  -<H2><A NAME="proxy">Customizing Proxy Error Messages</A></H2>
  +    <h2><a id="proxy" name="proxy">Customizing Proxy Error
  +    Messages</a></h2>
   
  -<P>
  - As of Apache-1.3, it is possible to use the <CODE>ErrorDocument</CODE>
  - mechanism for proxy error messages as well (previous versions always
  - returned fixed predefined error messages).
  -</P>
  -<P>
  - Most proxy errors return an error code of [500 Internal Server Error].
  - To find out whether a particular error document was invoked on behalf
  - of a proxy error or because of some other server error, and what the reason
  - for the failure was, you can check the contents of the new
  - <CODE>ERROR_NOTES</CODE> CGI environment variable:
  - if invoked for a proxy error, this variable will contain the actual proxy
  - error message text in HTML form.
  -</P>
  -<P>
  - The following excerpt demonstrates how to exploit the <CODE>ERROR_NOTES</CODE>
  - variable within an error document:
  -</P>
  -<PRE>
  +    <p>As of Apache-1.3, it is possible to use the
  +    <code>ErrorDocument</code> mechanism for proxy error messages
  +    as well (previous versions always returned fixed predefined
  +    error messages).</p>
  +
  +    <p>Most proxy errors return an error code of [500 Internal
  +    Server Error]. To find out whether a particular error document
  +    was invoked on behalf of a proxy error or because of some other
  +    server error, and what the reason for the failure was, you can
  +    check the contents of the new <code>ERROR_NOTES</code> CGI
  +    environment variable: if invoked for a proxy error, this
  +    variable will contain the actual proxy error message text in
  +    HTML form.</p>
  +
  +    <p>The following excerpt demonstrates how to exploit the
  +    <code>ERROR_NOTES</code> variable within an error document:</p>
  +<pre>
    &lt;!--#if expr="$REDIRECT_ERROR_NOTES = ''" --&gt;
     &lt;p&gt;
      The server encountered an unexpected condition
  @@ -321,16 +373,18 @@
    &lt;!--#else --&gt;
     &lt;!--#echo var="REDIRECT_ERROR_NOTES" --&gt;
    &lt;!--#endif --&gt;
  -</PRE>
  +</pre>
   
  -<H2><A NAME="listings">HTML listing of the discussed example</A></H2>
  -
  -So, to summarize our example, here's the complete listing of the
  -<SAMP>400.shtml.en</SAMP> document. You will notice that it contains
  -almost nothing but the error text (with conditional additions).
  -Starting with this example, you will find it easy to add more error
  -documents, or to translate the error documents to different languages.
  -<HR><PRE>
  +    <h2><a id="listings" name="listings">HTML listing of the
  +    discussed example</a></h2>
  +    So, to summarize our example, here's the complete listing of
  +    the <samp>400.shtml.en</samp> document. You will notice that it
  +    contains almost nothing but the error text (with conditional
  +    additions). Starting with this example, you will find it easy
  +    to add more error documents, or to translate the error
  +    documents to different languages. 
  +    <hr />
  +<pre>
   &lt;!--#set var="title" value="Bad Request"
   --&gt;&lt;!--#include virtual="head" --&gt;&lt;P&gt;
      Your browser sent a request that this server could not understand:
  @@ -351,18 +405,21 @@
      &lt;!--#endif --&gt;
      &lt;/P&gt;
   &lt;!--#include virtual="foot" --&gt;
  -</PRE><HR>
  -
  -Here is the complete <SAMP>head.shtml</SAMP> file (the funny line
  -breaks avoid empty lines in the document after XSSI processing). Note the 
  -configuration section at top. That's where you configure the images and logos
  -as well as the apache documentation directory. Look how this file displays
  -two different logos depending on the content of the virtual host name
  -($SERVER_NAME), and that an animated apache logo is shown if the browser
  -appears to support it (the latter requires server configuration lines
  -of the form <BR><CODE>BrowserMatch "^Mozilla/[2-4]" anigif</CODE><BR>
  -for browser types which support animated GIFs).
  -<HR><PRE>
  +</pre>
  +    <hr />
  +    Here is the complete <samp>head.shtml</samp> file (the funny
  +    line breaks avoid empty lines in the document after XSSI
  +    processing). Note the configuration section at top. That's
  +    where you configure the images and logos as well as the Apache
  +    documentation directory. Note how this file displays two
  +    different logos depending on the content of the virtual host
  +    name ($SERVER_NAME), and that an animated Apache logo is shown
  +    if the browser appears to support it (the latter requires
  +    server configuration lines of the form <br />
  +     <code>BrowserMatch "^Mozilla/[2-4]" anigif</code><br />
  +     for browser types which support animated GIFs). 
  +    <hr />
  +<pre>
   &lt;!--#if expr="$SERVER_NAME = /.*\.mycompany\.com/" 
   --&gt;&lt;!--#set var="IMG_CorpLogo"
               value="http://$SERVER_NAME:$SERVER_PORT/errordocs/CorpLogo.gif" 
  @@ -394,10 +451,11 @@
     &lt;/H1&gt;
     &lt;HR&gt;&lt;!-- ======================================================== --&gt;
     &lt;DIV&gt;
  -</PRE><HR>
  - and this is the <SAMP>foot.shtml.en</SAMP> file:
  -<HR><PRE>
  -
  +</pre>
  +    <hr />
  +    and this is the <samp>foot.shtml.en</samp> file: 
  +    <hr />
  +<pre>
     &lt;/DIV&gt;
     &lt;HR&gt;
     &lt;DIV ALIGN="right"&gt;&lt;SMALL&gt;&lt;SUP&gt;Local Server time:
  @@ -419,15 +477,13 @@
     &lt;/ADDRESS&gt;
    &lt;/UL&gt;&lt;/BODY&gt;
   &lt;/HTML&gt;
  -</PRE><HR>
  -
  -
  -<H3>More welcome!</H3>
  -
  -If you have tips to contribute, send mail to <A
  -HREF="mailto:martin@apache.org">martin@apache.org</A>
  +</pre>
  +    <hr />
   
  -  <!--#include virtual="footer.html" -->
  -</BODY>
  -</HTML>
  +    <h3>More welcome!</h3>
  +    If you have tips to contribute, send mail to <a
  +    href="mailto:martin@apache.org">martin@apache.org</a> 
  +    <!--#include virtual="footer.html" -->
  +  </body>
  +</html>
   
  
  
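For readers who generate error pages from a CGI script rather than from XSSI, the same ERROR_NOTES check can be sketched in Python. This is only an illustration of the conditional shown in the excerpt above, not part of the commit: the function name `error_detail` and the `GENERIC_TEXT` placeholder are hypothetical, and the sketch assumes the variable reaches the script as `REDIRECT_ERROR_NOTES`, as in the SSI excerpt.

```python
# Hypothetical placeholder; the real generic wording lives in the error document.
GENERIC_TEXT = "<p>(generic server error text)</p>"


def error_detail(environ):
    """Return the HTML fragment to show for the failure.

    Mirrors the SSI conditional: if REDIRECT_ERROR_NOTES is empty or
    missing, fall back to the generic text; otherwise echo the notes,
    which the text says are already in HTML form.
    """
    notes = environ.get("REDIRECT_ERROR_NOTES", "")
    if notes == "":
        return GENERIC_TEXT
    return notes
```

In a real CGI script one would pass `os.environ` as the argument.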
  
  1.10      +193 -178  httpd-2.0/docs/manual/misc/descriptors.html
  
  Index: descriptors.html
  ===================================================================
  RCS file: /home/cvs/httpd-2.0/docs/manual/misc/descriptors.html,v
  retrieving revision 1.9
  retrieving revision 1.10
  diff -u -r1.9 -r1.10
  --- descriptors.html	2000/10/17 00:29:08	1.9
  +++ descriptors.html	2001/09/22 19:33:40	1.10
  @@ -1,182 +1,197 @@
  -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
  -<HTML>
  -<HEAD>
  -<TITLE>Descriptors and Apache</TITLE>
  -</HEAD>
  -
  -<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  -<BODY
  - BGCOLOR="#FFFFFF"
  - TEXT="#000000"
  - LINK="#0000FF"
  - VLINK="#000080"
  - ALINK="#FF0000"
  ->
  -<!--#include virtual="header.html" -->
  -<H1 ALIGN="CENTER">Descriptors and Apache</H1>
  -
  -<P>A <EM>descriptor</EM>, also commonly called a <EM>file handle</EM> is
  -an object that a program uses to read or write an open file, or open
  -network socket, or a variety of other devices.  It is represented
  -by an integer, and you may be familiar with <CODE>stdin</CODE>,
  -<CODE>stdout</CODE>, and <CODE>stderr</CODE> which are descriptors 0,
  -1, and 2 respectively.
  -Apache needs a descriptor for each log file, plus one for each
  -network socket that it listens on, plus a handful of others.  Libraries
  -that Apache uses may also require descriptors.  Normal programs don't
  -open up many descriptors at all, and so there are some latent problems
  -that you may experience should you start running Apache with many
  -descriptors (<EM>i.e.</EM>, with many virtual hosts).
  -
  -<P>The operating system enforces a limit on the number of descriptors
  -that a program can have open at a time.  There are typically three limits
  -involved here.  One is a kernel limitation, depending on your operating
  -system you will either be able to tune the number of descriptors available
  -to higher numbers (this is frequently called <EM>FD_SETSIZE</EM>).  Or you
  -may be stuck with a (relatively) low amount.  The second limit is called
  -the <EM>hard resource</EM> limit, and it is sometimes set by root in an
  -obscure operating system file, but frequently is the same as the kernel
  -limit.  The third limit is called the <EM>soft
  -resource</EM> limit.  The soft limit is always less than or equal to
  -the hard limit.  For example, the hard limit may be 1024, but the soft
  -limit only 64.  Any user can raise their soft limit up to the hard limit.
  -Root can raise the hard limit up to the system maximum limit.  The soft
  -limit is the actual limit that is used when enforcing the maximum number
  -of files a process can have open.
  +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  +    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
   
  -<P>To summarize:
  +<html xmlns="http://www.w3.org/1999/xhtml">
  +  <head>
  +    <meta name="generator" content="HTML Tidy, see www.w3.org" />
  +
  +    <title>Descriptors and Apache</title>
  +  </head>
  +  <!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  +
  +  <body bgcolor="#FFFFFF" text="#000000" link="#0000FF"
  +  vlink="#000080" alink="#FF0000">
  +    <!--#include virtual="header.html" -->
  +
  +    <h1 align="CENTER">Descriptors and Apache</h1>
  +
  +    <p>A <em>descriptor</em>, also commonly called a <em>file
  +    handle</em>, is an object that a program uses to read or write
  +    an open file, or open network socket, or a variety of other
  +    devices. It is represented by an integer, and you may be
  +    familiar with <code>stdin</code>, <code>stdout</code>, and
  +    <code>stderr</code> which are descriptors 0, 1, and 2
  +    respectively. Apache needs a descriptor for each log file, plus
  +    one for each network socket that it listens on, plus a handful
  +    of others. Libraries that Apache uses may also require
  +    descriptors. Normal programs don't open up many descriptors at
  +    all, and so there are some latent problems that you may
  +    experience should you start running Apache with many
  +    descriptors (<em>i.e.</em>, with many virtual hosts).</p>
  +
  +    <p>The operating system enforces a limit on the number of
  +    descriptors that a program can have open at a time. There are
  +    typically three limits involved here. One is a kernel
  +    limitation: depending on your operating system, you will either
  +    be able to tune the number of descriptors available to a higher
  +    number (this is frequently called <em>FD_SETSIZE</em>), or you
  +    may be stuck with a (relatively) low number. The second limit
  +    is called the <em>hard resource</em> limit, and it is sometimes
  +    set by root in an obscure operating system file, but frequently
  +    is the same as the kernel limit. The third limit is called the
  +    <em>soft resource</em> limit. The soft limit is always less
  +    than or equal to the hard limit. For example, the hard limit
  +    may be 1024, but the soft limit only 64. Any user can raise
  +    their soft limit up to the hard limit. Root can raise the hard
  +    limit up to the system maximum limit. The soft limit is the
  +    actual limit that is used when enforcing the maximum number of
  +    files a process can have open.</p>
   
  -<CENTER><PRE>
  +    <p>To summarize:</p>
  +
  +    <center>
  +<pre>
     #open files  &lt;=  soft limit  &lt;=  hard limit  &lt;=  kernel limit
  -</PRE></CENTER>
  +</pre>
  +    </center>
  +
  +    <p>You control the hard and soft limits using the
  +    <code>limit</code> (csh) or <code>ulimit</code> (sh)
  +    commands. See the respective man pages for more information.
  +    For example, you can probably use <code>ulimit -n
  +    unlimited</code> to raise your soft limit up to the hard limit.
  +    You should include this command in a shell script which starts
  +    your webserver.</p>
  +
  +    <p>Unfortunately, it's not always this simple. As mentioned
  +    above, you will probably run into some system limitations that
  +    will need to be worked around somehow. Work was done in version
  +    1.2.1 to improve the situation somewhat. Here is a partial list
  +    of systems and workarounds (assuming you are using 1.2.1 or
  +    later):</p>
  +
  +    <dl>
  +      <dt><strong>BSDI 2.0</strong></dt>
  +
  +      <dd>Under BSDI 2.0 you can build Apache to support more
  +      descriptors by adding <code>-DFD_SETSIZE=nnn</code> to
  +      <code>EXTRA_CFLAGS</code> (where nnn is the number of
  +      descriptors you wish to support; keep it less than the hard
  +      limit). But it will run into trouble if more than
  +      approximately 240 Listen directives are used. This may be
  +      cured by rebuilding your kernel with a higher
  +      FD_SETSIZE.</dd>
  +
  +      <dt><strong>FreeBSD 2.2, BSDI 2.1+</strong></dt>
  +
  +      <dd>Similar to the BSDI 2.0 case, you should define
  +      <code>FD_SETSIZE</code> and rebuild. But the extra Listen
  +      limitation doesn't exist.</dd>
  +
  +      <dt><strong>Linux</strong></dt>
  +
  +      <dd>By default Linux has a kernel maximum of 256 open
  +      descriptors per process. There are several patches available
  +      for the 2.0.x series which raise this to 1024 and beyond, and
  +      you can find them in the "unofficial patches" section of <a
  +      href="http://www.linuxhq.com/">the Linux Information HQ</a>.
  +      None of these patches are perfect, and an entirely different
  +      approach is likely to be taken during the 2.1.x development.
  +      Applying these patches will raise the FD_SETSIZE used to
  +      compile all programs, and unless you rebuild all your
  +      libraries you should avoid running any other program with a
  +      soft descriptor limit above 256. As of this writing the
  +      patches available for increasing the number of descriptors do
  +      not take this into account. On a dedicated webserver you
  +      probably won't run into trouble.</dd>
  +
  +      <dt><strong>Solaris through 2.5.1</strong></dt>
  +
  +      <dd>Solaris has a kernel hard limit of 1024 (may be lower in
  +      earlier versions). But it has a limitation that files using
  +      the stdio library cannot have a descriptor above 255. Apache
  +      uses the stdio library for the ErrorLog directive. When you
  +      have more than approximately 110 virtual hosts (with an error
  +      log and an access log each) you will need to build Apache
  +      with <code>-DHIGH_SLACK_LINE=256</code> added to
  +      <code>EXTRA_CFLAGS</code>. You will be limited to
  +      approximately 240 error logs if you do this.</dd>
  +
  +      <dt><strong>AIX</strong></dt>
  +
  +      <dd>AIX version 3.2?? appears to have a hard limit of 128
  +      descriptors. End of story. Version 4.1.5 has a hard limit of
  +      2000.</dd>
  +
  +      <dt><strong>SCO OpenServer</strong></dt>
  +
  +      <dd>Edit the <code>/etc/conf/cf.d/stune</code> file or use
  +      <code>/etc/conf/cf.d/configure</code> choice 7 (User and
  +      Group configuration) and modify the <code>NOFILES</code>
  +      kernel parameter to a suitably higher value. SCO recommends a
  +      number between 60 and 11000; the default is 110. Relink and
  +      reboot, and the new number of descriptors will be
  +      available.</dd>
  +
  +      <dt><strong>Compaq Tru64 UNIX/Digital UNIX/OSF</strong></dt>
  +
  +      <dd>
  +        <ol>
  +          <li>Raise <code>open_max_soft</code> and
  +          <code>open_max_hard</code> to 4096 in the proc subsystem.
  +          Do a man on sysconfig, sysconfigdb, and
  +          sysconfigtab.</li>
  +
  +          <li>Raise <code>max-vnodes</code> to a large number which
  +          is greater than the number of apache processes * 4096
  +          (Setting it to 250,000 should be good for most people).
  +          Do a man on sysconfig, sysconfigdb, and
  +          sysconfigtab.</li>
  +
  +          <li>If you are using Tru64 5.0, 5.0A, or 5.1, define
  +          <code>NO_SLACK</code> to work around a bug in the OS.
  +          <code>CFLAGS="-DNO_SLACK" ./configure</code></li>
  +        </ol>
  +      </dd>
  +
  +      <dt><strong>Others</strong></dt>
  +
  +      <dd>If you have details on another operating system, please
  +      submit it through our <a
  +      href="http://www.apache.org/bug_report.html">Bug Report
  +      Page</a>.</dd>
  +    </dl>
  +
  +    <p>In addition to the problems described above there are
  +    problems with many libraries that Apache uses. The most common
  +    example is the bind DNS resolver library that is used by pretty
  +    much every unix, which fails if it ends up with a descriptor
  +    above 256. We suspect there are other libraries that have similar
  +    limitations. So the code as of 1.2.1 takes a defensive stance
  +    and tries to save descriptors less than 16 for use while
  +    processing each request. This is called the <em>low slack
  +    line</em>.</p>
  +
  +    <p>Note that this shouldn't waste descriptors. If you really
  +    are pushing the limits and Apache can't get a descriptor above
  +    16 when it wants it, it will settle for one below 16.</p>
  +
  +    <p>In extreme situations you may want to lower the low slack
  +    line, but you shouldn't ever need to. For example, lowering it
  +    can increase the limits of 240 described above under Solaris and
  +    BSDI 2.0. But you'll play a delicate balancing game with the
  +    descriptors needed to serve a request. Should you want to play
  +    this game, the compile time parameter is
  +    <code>LOW_SLACK_LINE</code> and there's a tiny bit of
  +    documentation in the header file <code>httpd.h</code>.</p>
  +
  +    <p>Finally, if you suspect that all this slack stuff is causing
  +    you problems, you can disable it. Add <code>-DNO_SLACK</code>
  +    to <code>EXTRA_CFLAGS</code> and rebuild. But please report it
  +    to our <a href="http://www.apache.org/bug_report.html">Bug
  +    Report Page</a> so that we can investigate. 
  +    <!--#include virtual="footer.html" -->
  +    </p>
  +  </body>
  +</html>
   
  -<P>You control the hard and soft limits using the <CODE>limit</CODE> (csh)
  -or <CODE>ulimit</CODE> (sh) directives.  See the respective man pages
  -for more information.  For example you can probably use
  -<CODE>ulimit -n unlimited</CODE> to raise your soft limit up to the
  -hard limit.  You should include this command in a shell script which
  -starts your webserver.
  -
  -<P>Unfortunately, it's not always this simple.  As mentioned above,
  -you will probably run into some system limitations that will need to be
  -worked around somehow.  Work was done in version 1.2.1 to improve the
  -situation somewhat.  Here is a partial list of systems and workarounds
  -(assuming you are using 1.2.1 or later):
  -
  -<DL>
  -
  -    <DT><STRONG>BSDI 2.0</STRONG>
  -    <DD>Under BSDI 2.0 you can build Apache to support more descriptors
  -        by adding <CODE>-DFD_SETSIZE=nnn</CODE> to
  -        <CODE>EXTRA_CFLAGS</CODE> (where nnn is the number of descriptors
  -        you wish to support, keep it less than the hard limit).  But it
  -        will run into trouble if more than approximately 240 Listen
  -        directives are used.  This may be cured by rebuilding your kernel
  -        with a higher FD_SETSIZE.
  -    <P>
  -
  -    <DT><STRONG>FreeBSD 2.2, BSDI 2.1+</STRONG>
  -    <DD>Similar to the BSDI 2.0 case, you should define
  -        <CODE>FD_SETSIZE</CODE> and rebuild.  But the extra
  -        Listen limitation doesn't exist.
  -    <P>
  -
  -    <DT><STRONG>Linux</STRONG>
  -    <DD>By default Linux has a kernel maximum of 256 open descriptors
  -        per process.  There are several patches available for the
  -        2.0.x series which raise this to 1024 and beyond, and you
  -        can find them in the "unofficial patches" section of <A
  -        HREF="http://www.linuxhq.com/">the Linux Information HQ</A>.
  -        None of these patches are perfect, and an entirely different
  -        approach is likely to be taken during the 2.1.x development.
  -        Applying these patches will raise the FD_SETSIZE used to compile
  -        all programs, and unless you rebuild all your libraries you should
  -        avoid running any other program with a soft descriptor limit above
  -        256.  As of this writing the patches available for increasing
  -        the number of descriptors do not take this into account.  On a
  -        dedicated webserver you probably won't run into trouble.
  -    <P>
  -
  -    <DT><STRONG>Solaris through 2.5.1</STRONG>
  -    <DD>Solaris has a kernel hard limit of 1024 (may be lower in earlier
  -        versions).  But it has a limitation that files using
  -        the stdio library cannot have a descriptor above 255.
  -        Apache uses the stdio library for the ErrorLog directive.
  -        When you have more than approximately 110 virtual hosts
  -        (with an error log and an access log each) you will need to
  -        build Apache with <CODE>-DHIGH_SLACK_LINE=256</CODE> added to
  -        <CODE>EXTRA_CFLAGS</CODE>.  You will be limited to approximately
  -        240 error logs if you do this.
  -    <P>
  -
  -    <DT><STRONG>AIX</STRONG>
  -    <DD>AIX version 3.2?? appears to have a hard limit of 128 descriptors.
  -	End of story.  Version 4.1.5 has a hard limit of 2000.
  -    <P>
  -
  -    <DT><STRONG>SCO OpenServer</STRONG> 
  -    <DD>Edit the
  -    <CODE>/etc/conf/cf.d/stune</CODE> file or use 
  -    <CODE>/etc/conf/cf.d/configure</CODE> choice 7
  -    (User and Group configuration) and modify the <CODE>NOFILES</CODE> kernel
  -    parameter to a suitably higher value.  SCO recommends a number
  -    between 60 and 11000, the default is 110.  Relink and reboot, 
  -    and the new number of descriptors will be available.
  -
  -    <P>
  -
  -    <DT><STRONG>Compaq Tru64 UNIX/Digital UNIX/OSF</STRONG>
  -    <DD><OL>
  -    <LI>Raise <code>open_max_soft</code> and <code>open_max_hard</code>
  -        to 4096 in the proc subsystem.
  -        Do a man on sysconfig, sysconfigdb, and sysconfigtab.
  -    <LI>Raise <code>max-vnodes</code> to a large number which is greater 
  -        than the number of apache processes * 4096
  -        (Setting it to 250,000 should be good for most people).
  -        Do a man on sysconfig, sysconfigdb, and sysconfigtab.
  -    <LI>If you are using Tru64 5.0, 5.0A, or 5.1, define 
  -        <code>NO_SLACK</code> to work around a bug in the OS.
  -        <code>CFLAGS="-DNO_SLACK" ./configure</code>
  -    </OL>
  -
  -    <P>
  -
  -    <DT><STRONG>Others</STRONG>
  -    <DD>If you have details on another operating system, please submit
  -        it through our <A HREF="http://www.apache.org/bug_report.html">Bug
  -        Report Page</A>.
  -    <P>
  -
  -</DL>
  -
  -<P>In addition to the problems described above there are problems with
  -many libraries that Apache uses.  The most common example is the bind
  -DNS resolver library that is used by pretty much every unix, which
  -fails if it ends up with a descriptor above 256.  We suspect there
  -are other libraries that similar limitations.  So the code as of 1.2.1
  -takes a defensive stance and tries to save descriptors less than 16
  -for use while processing each request.  This is called the <EM>low
  -slack line</EM>.
  -
  -<P>Note that this shouldn't waste descriptors.  If you really are pushing
  -the limits and Apache can't get a descriptor above 16 when it wants
  -it, it will settle for one below 16.
  -
  -<P>In extreme situations you may want to lower the low slack line,
  -but you shouldn't ever need to.  For example, lowering it can
  -increase the limits 240 described above under Solaris and BSDI 2.0.
  -But you'll play a delicate balancing game with the descriptors needed
  -to serve a request.  Should you want to play this game, the compile
  -time parameter is <CODE>LOW_SLACK_LINE</CODE> and there's a tiny
  -bit of documentation in the header file <CODE>httpd.h</CODE>.
  -
  -<P>Finally, if you suspect that all this slack stuff is causing you
  -problems, you can disable it.  Add <CODE>-DNO_SLACK</CODE> to
  -<CODE>EXTRA_CFLAGS</CODE> and rebuild.  But please report it to
  -our <A HREF="http://www.apache.org/bug_report.html">Bug
  -Report Page</A> so that
  -we can investigate.
  -
  -<!--#include virtual="footer.html" -->
  -</BODY>
  -</HTML>
  
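The soft/hard limit relationship described in descriptors.html can be inspected from a program as well as from the shell. The following is a minimal sketch, assuming a Unix system where Python's standard `resource` module is available; the function name `raise_soft_nofile_limit` is made up for this illustration. It does roughly what the document suggests doing with `ulimit -n` before starting the server: raise the soft descriptor limit up to the hard limit.

```python
import resource


def raise_soft_nofile_limit():
    """Raise the soft open-file limit to the hard limit, if possible.

    Returns the (soft, hard) pair in effect afterwards. The hard limit
    is never changed; only root may raise it.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != hard:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
        except (ValueError, OSError):
            # Some systems refuse this (for example when the hard limit
            # is "unlimited" but the kernel has a lower ceiling); keep
            # the current soft limit in that case.
            pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

The limit applies to the calling process and its children, which is why the document recommends putting the equivalent shell command in the script that starts the webserver.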
  
  
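The "low slack line" idea from descriptors.html (keep descriptors below 16 free, but settle for a low one if nothing higher is available) can be illustrated with the POSIX `F_DUPFD` request, which duplicates a descriptor onto the lowest free number at or above a given minimum. This is a sketch of the idea only, not Apache's code: the function name `move_above_slack_line` is hypothetical, and the value 16 is taken from the text.

```python
import fcntl
import os

LOW_SLACK_LINE = 16  # value quoted in the document


def move_above_slack_line(fd, line=LOW_SLACK_LINE):
    """Try to move fd to a descriptor number >= line.

    If fd is already high enough it is returned unchanged. If no
    descriptor at or above the line can be obtained, the original low
    descriptor is kept, so nothing is wasted.
    """
    if fd >= line:
        return fd
    try:
        new_fd = fcntl.fcntl(fd, fcntl.F_DUPFD, line)
    except OSError:
        return fd  # settle for the descriptor below the line
    os.close(fd)  # release the low descriptor for libraries that need one
    return new_fd
```

A caller would wrap each long-lived descriptor (a log file or listening socket, say) right after opening it, leaving the low numbers for code that cannot cope with high ones.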
  1.18      +399 -324  httpd-2.0/docs/manual/misc/fin_wait_2.html
  
  Index: fin_wait_2.html
  ===================================================================
  RCS file: /home/cvs/httpd-2.0/docs/manual/misc/fin_wait_2.html,v
  retrieving revision 1.17
  retrieving revision 1.18
  diff -u -r1.17 -r1.18
  --- fin_wait_2.html	2001/03/28 21:26:29	1.17
  +++ fin_wait_2.html	2001/09/22 19:33:40	1.18
  @@ -1,324 +1,399 @@
  -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
  -<HTML>
  -<HEAD>
  -<TITLE>Connections in FIN_WAIT_2 and Apache</TITLE>
  -<LINK REV="made" HREF="mailto:marc@apache.org">
  -
  -</HEAD>
  -
  -<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  -<BODY
  - BGCOLOR="#FFFFFF"
  - TEXT="#000000"
  - LINK="#0000FF"
  - VLINK="#000080"
  - ALINK="#FF0000"
  ->
  -<!--#include virtual="header.html" -->
  -
  -<H1 ALIGN="CENTER">Connections in the FIN_WAIT_2 state and Apache</H1>
  -<OL>
  -<LI><H2>What is the FIN_WAIT_2 state?</H2>
  -Starting with the Apache 1.2 betas, people are reporting many more
  -connections in the FIN_WAIT_2 state (as reported by
  -<CODE>netstat</CODE>) than they saw using older versions.  When the
  -server closes a TCP connection, it sends a packet with the FIN bit
  -set to the client, which then responds with a packet with the ACK bit
  -set.  The client then sends a packet with the FIN bit set to the
  -server, which responds with an ACK and the connection is closed.  The
  -state that the connection is in during the period between when the
  -server gets the ACK from the client and the server gets the FIN from
  -the client is known as FIN_WAIT_2.  See the <A
  -HREF="ftp://ds.internic.net/rfc/rfc793.txt">TCP RFC</A> for the
  -technical details of the state transitions.<P>
  -
  -The FIN_WAIT_2 state is somewhat unusual in that there is no timeout
  -defined in the standard for it.  This means that on many operating
  -systems, a connection in the FIN_WAIT_2 state will stay around until
  -the system is rebooted.  If the system does not have a timeout and
  -too many FIN_WAIT_2 connections build up, it can fill up the space
  -allocated for storing information about the connections and crash
  -the kernel.  The connections in FIN_WAIT_2 do not tie up an httpd
  -process.<P>
  -
  -<LI><H2>But why does it happen?</H2>
  -
  -There are numerous reasons for it happening, some of them may not
  -yet be fully clear.  What is known follows.<P>
  -
  -<H3>Buggy clients and persistent connections</H3>
  -
  -Several clients have a bug which pops up when dealing with
  -<A HREF="../keepalive.html">persistent connections</A> (aka keepalives).
  -When the connection is idle and the server closes the connection
  -(based on the <A HREF="../mod/core.html#keepalivetimeout">
  -KeepAliveTimeout</A>), the client is programmed so that the client does
  -not send back a FIN and ACK to the server.  This means that the
  -connection stays in the FIN_WAIT_2 state until one of the following
  -happens:<P>
  -<UL>
  -        <LI>The client opens a new connection to the same or a different
  -            site, which causes it to fully close the older connection on
  -            that socket.
  -        <LI>The user exits the client, which on some (most?) clients
  -            causes the OS to fully shutdown the connection.
  -        <LI>The FIN_WAIT_2 times out, on servers that have a timeout
  -            for this state.
  -</UL><P>
  -If you are lucky, this means that the buggy client will fully close the
  -connection and release the resources on your server.  However, there
  -are some cases where the socket is never fully closed, such as a dialup
  -client disconnecting from their provider before closing the client.
  -In addition, a client might sit idle for days without making another
  -connection, and thus may hold its end of the socket open for days
  -even though it has no further use for it.
  -<STRONG>This is a bug in the browser or in its operating system's
  -TCP implementation.</STRONG>  <P>
  -
  -The clients on which this problem has been verified to exist:<P>
  -<UL>
  -        <LI>Mozilla/3.01 (X11; I; FreeBSD 2.1.5-RELEASE i386)
  -        <LI>Mozilla/2.02 (X11; I; FreeBSD 2.1.5-RELEASE i386)
  -        <LI>Mozilla/3.01Gold (X11; I; SunOS 5.5 sun4m)
  -        <LI>MSIE 3.01 on the Macintosh
  -        <LI>MSIE 3.01 on Windows 95
  -</UL><P>
  -
  -This does not appear to be a problem on:
  -<UL>
  -        <LI>Mozilla/3.01 (Win95; I)
  -</UL>
  -<P>
  -
  -It is expected that many other clients have the same problem. What a
  -client <STRONG>should do</STRONG> is periodically check its open
  -socket(s) to see if they have been closed by the server, and close their
  -side of the connection if the server has closed.  This check need only
  -occur once every few seconds, and may even be detected by an OS signal
  -on some systems (<EM>e.g.</EM>, Win95 and NT clients have this capability, but
  -they seem to be ignoring it).<P>
  -
  -Apache <STRONG>cannot</STRONG> avoid these FIN_WAIT_2 states unless it
  -disables persistent connections for the buggy clients, just
  -like we recommend doing for Navigator 2.x clients due to other bugs.
  -However, non-persistent connections increase the total number of
  -connections needed per client and slow retrieval of an image-laden
  -web page.  Since non-persistent connections have their own resource
  -consumptions and a short waiting period after each closure, a busy server
  -may need persistence in order to best serve its clients.<P>
  -
  -As far as we know, the client-caused FIN_WAIT_2 problem is present for
  -all servers that support persistent connections, including Apache 1.1.x
  -and 1.2.<P>
  -
  -<H3>A necessary bit of code introduced in 1.2</H3>
  -
  -While the above bug is a problem, it is not the whole problem.
  -Some users have observed no FIN_WAIT_2 problems with Apache 1.1.x,
  -but with 1.2b enough connections build up in the FIN_WAIT_2 state to
  -crash their server.  
  -
  -The most likely source for additional FIN_WAIT_2 states
  -is a function called <CODE>lingering_close()</CODE> which was added
  -between 1.1 and 1.2.  This function is necessary for the proper
  -handling of persistent connections and any request which includes
  -content in the message body (<EM>e.g.</EM>, PUTs and POSTs).
  -What it does is read any data sent by the client for
  -a certain time after the server closes the connection.  The exact
  -reasons for doing this are somewhat complicated, but involve what
  -happens if the client is making a request at the same time the
  -server sends a response and closes the connection. Without lingering,
  -the client might be forced to reset its TCP input buffer before it
  -has a chance to read the server's response, and thus understand why
  -the connection has closed.
  -See the <A HREF="#appendix">appendix</A> for more details.<P>
  -
  -The code in <CODE>lingering_close()</CODE> appears to cause problems
  -for a number of factors, including the change in traffic patterns
  -that it causes.  The code has been thoroughly reviewed and we are
  -not aware of any bugs in it.  It is possible that there is some
  -problem in the BSD TCP stack, aside from the lack of a timeout
  -for the FIN_WAIT_2 state, exposed by the <CODE>lingering_close</CODE>
  -code that causes the observed problems.<P>
  -
  -<H2><LI>What can I do about it?</H2>
  -
  -There are several possible workarounds to the problem, some of
  -which work better than others.<P>
  -
  -<H3>Add a timeout for FIN_WAIT_2</H3>
  -
  -The obvious workaround is to simply have a timeout for the FIN_WAIT_2 state.
  -This is not specified by the RFC, and could be claimed to be a
  -violation of the RFC, but it is widely recognized as being necessary.
  -The following systems are known to have a timeout:
  -<P>
  -<UL>
  -        <LI><A HREF="http://www.freebsd.org/">FreeBSD</A> versions starting at
  -            2.0 or possibly earlier.
  -        <LI><A HREF="http://www.netbsd.org/">NetBSD</A> version 1.2(?)
  -        <LI><A HREF="http://www.openbsd.org/">OpenBSD</A> all versions(?)
  -        <LI><A HREF="http://www.bsdi.com/">BSD/OS</A> 2.1, with the
  -            <A HREF="ftp://ftp.bsdi.com/bsdi/patches/patches-2.1/K210-027">
  -            K210-027</A> patch installed.
  -        <LI><A HREF="http://www.sun.com/">Solaris</A> as of around version
  -            2.2.  The timeout can be tuned by using <CODE>ndd</CODE> to
  -            modify <CODE>tcp_fin_wait_2_flush_interval</CODE>, but the
  -            default should be appropriate for most servers and improper
  -            tuning can have negative impacts.
  -        <LI><A HREF="http://www.linux.org/">Linux</A> 2.0.x and
  -            earlier(?)
  -        <LI><A HREF="http://www.hp.com/">HP-UX</A> 10.x defaults to
  -            terminating connections in the FIN_WAIT_2 state after the
  -            normal keepalive timeouts.  This does not
  -            refer to the persistent connection or HTTP keepalive
  -            timeouts, but the <CODE>SO_LINGER</CODE> socket option
  -            which is enabled by Apache.  This parameter can be adjusted
  -            by using <CODE>nettune</CODE> to modify parameters such as
  -            <CODE>tcp_keepstart</CODE> and <CODE>tcp_keepstop</CODE>.
  -            In later revisions, there is an explicit timer for
  -            connections in FIN_WAIT_2 that can be modified; contact HP
  -            support for details.
  -        <LI><A HREF="http://www.sgi.com/">SGI IRIX</A> can be patched to
  -            support a timeout.  For IRIX 5.3, 6.2, and 6.3,
  -            use patches 1654, 1703 and 1778 respectively.  If you
  -            have trouble locating these patches, please contact your
  -            SGI support channel for help.
  -        <LI><A HREF="http://www.ncr.com/">NCR's MP RAS Unix</A> 2.xx and
  -            3.xx both have FIN_WAIT_2 timeouts.  In 2.xx it is non-tunable
  -            at 600 seconds, while in 3.xx it defaults to 600 seconds and
  -            is calculated based on the tunable "max keep alive probes"
  -            (default of 8) multiplied by the "keep alive interval" (default
  -            75 seconds).
  -        <LI><A HREF="http://www.sequent.com">Sequent's ptx/TCP/IP for
  -            DYNIX/ptx</A> has had a FIN_WAIT_2 timeout since around
  -            release 4.1 in mid-1994.
  -</UL>
  -<P>
  -The following systems are known to not have a timeout:
  -<P>
  -<UL>
  -        <LI><A HREF="http://www.sun.com/">SunOS 4.x</A> does not and
  -            almost certainly never will have one because it is at the
  -            very end of its development cycle for Sun.  If you have kernel
  -            source, it should be easy to patch.
  -</UL>
  -<P>
  -There is a
  -<A HREF="http://www.apache.org/dist/httpd/contrib/patches/1.2/fin_wait_2.patch"
  ->patch available</A> for adding a timeout to the FIN_WAIT_2 state; it
  -was originally intended for BSD/OS, but should be adaptable to most
  -systems using BSD networking code.  You need kernel source code to be
  -able to use it.  If you do adapt it to work for any other systems,
  -please drop me a note at <A HREF="mailto:marc@apache.org">marc@apache.org</A>.
  -<P>
  -<H3>Compile without using <CODE>lingering_close()</CODE></H3>
  -
  -It is possible to compile Apache 1.2 without using the
  -<CODE>lingering_close()</CODE> function.  This will result in that
  -section of code being similar to that which was in 1.1.  If you do
  -this, be aware that it can cause problems with PUTs, POSTs and
  -persistent connections, especially if the client uses pipelining.
  -That said, it is no worse than on 1.1, and we understand that keeping your
  -server running is quite important.<P>
  -
  -To compile without the <CODE>lingering_close()</CODE> function, add
  -<CODE>-DNO_LINGCLOSE</CODE> to the end of the
  -<CODE>EXTRA_CFLAGS</CODE> line in your <CODE>Configuration</CODE> file,
  -rerun <CODE>Configure</CODE> and rebuild the server.
  -<P>
  -<H3>Use <CODE>SO_LINGER</CODE> as an alternative to
  -<CODE>lingering_close()</CODE></H3>
  -
  -On most systems, there is an option called <CODE>SO_LINGER</CODE> that
  -can be set with <CODE>setsockopt(2)</CODE>.  It does something very
  -similar to <CODE>lingering_close()</CODE>, except that it is broken
  -on many systems so that it causes far more problems than
  -<CODE>lingering_close</CODE>.  On some systems, it could possibly work
  -better so it may be worth a try if you have no other alternatives. <P>
  -
  -To try it, add <CODE>-DUSE_SO_LINGER -DNO_LINGCLOSE</CODE>  to the end of the
  -<CODE>EXTRA_CFLAGS</CODE> line in your <CODE>Configuration</CODE>
  -file, rerun <CODE>Configure</CODE> and rebuild the server.  <P>
  -
  -<STRONG>NOTE:</STRONG> Attempting to use <CODE>SO_LINGER</CODE> and
  -<CODE>lingering_close()</CODE> at the same time is very likely to do
  -very bad things, so don't.<P>
  -
  -<H3>Increase the amount of memory used for storing connection state</H3>
  -<DL>
  -<DT>BSD based networking code:
  -<DD>BSD stores network data, such as connection states,
  -in something called an mbuf.  When you get so many connections
  -that the kernel does not have enough mbufs to put them all in, your
  -kernel will likely crash.  You can reduce the effects of the problem
  -by increasing the number of mbufs that are available; this will not
  -prevent the problem, it will just make the server go longer before
  -crashing.<P>
  -
  -The exact way to increase them may depend on your OS; look
  -for some reference to the number of "mbufs" or "mbuf clusters".  On
  -many systems, this can be done by adding the line
  -<CODE>NMBCLUSTERS="n"</CODE> to your kernel config file, where
  -<CODE>n</CODE> is the number of mbuf clusters you want, and then
  -rebuilding your kernel.<P>
  -</DL>
  -
  -<H3>Disable KeepAlive</H3>
  -<P>If you are unable to do any of the above then you should, as a last
  -resort, disable KeepAlive.  Edit your httpd.conf and change "KeepAlive On"
  -to "KeepAlive Off".
  -
  -<H2><LI>Feedback</H2>
  -
  -If you have any information to add to this page, please contact me at
  -<A HREF="mailto:marc@apache.org">marc@apache.org</A>.<P>
  -
  -<H2><A NAME="appendix"><LI>Appendix</A></H2>
  -<P>
  -Below is a message from Roy Fielding, one of the authors of HTTP/1.1.
  -
  -<H3>Why the lingering close functionality is necessary with HTTP</H3>
  -
  -The need for a server to linger on a socket after a close is noted a couple
  -times in the HTTP specs, but not explained.  This explanation is based on
  -discussions between myself, Henrik Frystyk, Robert S. Thau, Dave Raggett,
  -and John C. Mallery in the hallways of MIT while I was at W3C.<P>
  -
  -If a server closes the input side of the connection while the client
  -is sending data (or is planning to send data), then the server's TCP
  -stack will signal an RST (reset) back to the client.  Upon
  -receipt of the RST, the client will flush its own incoming TCP buffer
  -back to the un-ACKed packet indicated by the RST packet argument.
  -If the server has sent a message, usually an error response, to the
  -client just before the close, and the client receives the RST packet
  -before its application code has read the error message from its incoming
  -TCP buffer and before the server has received the ACK sent by the client
  -upon receipt of that buffer, then the RST will flush the error message
  -before the client application has a chance to see it. The result is
  -that the client is left thinking that the connection failed for no
  -apparent reason.<P>
  -
  -There are two conditions under which this is likely to occur:
  -<OL>
  -<LI>sending POST or PUT data without proper authorization
  -<LI>sending multiple requests before each response (pipelining)
  -    and one of the middle requests resulting in an error or
  -    other break-the-connection result.
  -</OL>
  -<P>
  -The solution in all cases is to send the response, close only the
  -write half of the connection (what shutdown is supposed to do), and
  -continue reading on the socket until it is either closed by the
  -client (signifying it has finally read the response) or a timeout occurs.
  -That is what the kernel is supposed to do if SO_LINGER is set.
  -Unfortunately, SO_LINGER has no effect on some systems; on some other
  -systems, it does not have its own timeout and thus the TCP memory
  -segments just pile-up until the next reboot (planned or not).<P>
  -
  -Please note that simply removing the linger code will not solve the
  -problem -- it only moves it to a different and much harder one to detect.
  -</OL>
  -<!--#include virtual="footer.html" -->
  -</BODY>
  -</HTML>
  +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  +    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  +
  +<html xmlns="http://www.w3.org/1999/xhtml">
  +  <head>
  +    <meta name="generator" content="HTML Tidy, see www.w3.org" />
  +
  +    <title>Connections in FIN_WAIT_2 and Apache</title>
  +    <link rev="made" href="mailto:marc@apache.org" />
  +  </head>
  +  <!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  +
  +  <body bgcolor="#FFFFFF" text="#000000" link="#0000FF"
  +  vlink="#000080" alink="#FF0000">
  +    <!--#include virtual="header.html" -->
  +
  +    <h1 align="CENTER">Connections in the FIN_WAIT_2 state and
  +    Apache</h1>
  +
  +    <ol>
  +      <li>
  +        <h2>What is the FIN_WAIT_2 state?</h2>
  +        Starting with the Apache 1.2 betas, people are reporting
  +        many more connections in the FIN_WAIT_2 state (as reported
  +        by <code>netstat</code>) than they saw using older
  +        versions. When the server closes a TCP connection, it sends
  +        a packet with the FIN bit set to the client, which then
  +        responds with a packet with the ACK bit set. The client
  +        then sends a packet with the FIN bit set to the server,
  +        which responds with an ACK and the connection is closed.
  +        The state that the connection is in during the period
  +        between when the server gets the ACK from the client and
  +        the server gets the FIN from the client is known as
  +        FIN_WAIT_2. See the <a
  +        href="ftp://ds.internic.net/rfc/rfc793.txt">TCP RFC</a> for
  +        the technical details of the state transitions. 
  +
  +        <p>The FIN_WAIT_2 state is somewhat unusual in that there
  +        is no timeout defined in the standard for it. This means
  +        that on many operating systems, a connection in the
  +        FIN_WAIT_2 state will stay around until the system is
  +        rebooted. If the system does not have a timeout and too
  +        many FIN_WAIT_2 connections build up, it can fill up the
  +        space allocated for storing information about the
  +        connections and crash the kernel. The connections in
  +        FIN_WAIT_2 do not tie up an httpd process.</p>
  +      </li>
  +
  +      <li>
  +        <h2>But why does it happen?</h2>
  +        There are numerous reasons for it happening, some of which
  +        may not yet be fully clear. What is known follows. 
  +
  +        <h3>Buggy clients and persistent connections</h3>
  +        Several clients have a bug which pops up when dealing with
  +        <a href="../keepalive.html">persistent connections</a> (aka
  +        keepalives). When the connection is idle and the server
  +        closes the connection (based on the <a
  +        href="../mod/core.html#keepalivetimeout">KeepAliveTimeout</a>),
  +        the client is programmed so that the client does not send
  +        back a FIN and ACK to the server. This means that the
  +        connection stays in the FIN_WAIT_2 state until one of the
  +        following happens: 
  +
  +        <ul>
  +          <li>The client opens a new connection to the same or a
  +          different site, which causes it to fully close the older
  +          connection on that socket.</li>
  +
  +          <li>The user exits the client, which on some (most?)
  +          clients causes the OS to fully shut down the
  +          connection.</li>
  +
  +          <li>The FIN_WAIT_2 times out, on servers that have a
  +          timeout for this state.</li>
  +        </ul>
  +
  +        <p>If you are lucky, this means that the buggy client will
  +        fully close the connection and release the resources on
  +        your server. However, there are some cases where the socket
  +        is never fully closed, such as a dialup client
  +        disconnecting from their provider before closing the
  +        client. In addition, a client might sit idle for days
  +        without making another connection, and thus may hold its
  +        end of the socket open for days even though it has no
  +        further use for it. <strong>This is a bug in the browser or
  +        in its operating system's TCP implementation.</strong></p>
  +
  +        <p>The clients on which this problem has been verified to
  +        exist:</p>
  +
  +        <ul>
  +          <li>Mozilla/3.01 (X11; I; FreeBSD 2.1.5-RELEASE
  +          i386)</li>
  +
  +          <li>Mozilla/2.02 (X11; I; FreeBSD 2.1.5-RELEASE
  +          i386)</li>
  +
  +          <li>Mozilla/3.01Gold (X11; I; SunOS 5.5 sun4m)</li>
  +
  +          <li>MSIE 3.01 on the Macintosh</li>
  +
  +          <li>MSIE 3.01 on Windows 95</li>
  +        </ul>
  +
  +        <p>This does not appear to be a problem on:</p>
  +
  +        <ul>
  +          <li>Mozilla/3.01 (Win95; I)</li>
  +        </ul>
  +
  +        <p>It is expected that many other clients have the same
  +        problem. What a client <strong>should do</strong> is
  +        periodically check its open socket(s) to see if they have
  +        been closed by the server, and close their side of the
  +        connection if the server has closed. This check need only
  +        occur once every few seconds, and may even be detected by an
  +        OS signal on some systems (<em>e.g.</em>, Win95 and NT
  +        clients have this capability, but they seem to be ignoring
  +        it).</p>
  +
  +        <p>Apache <strong>cannot</strong> avoid these FIN_WAIT_2
  +        states unless it disables persistent connections for the
  +        buggy clients, just like we recommend doing for Navigator
  +        2.x clients due to other bugs. However, non-persistent
  +        connections increase the total number of connections needed
  +        per client and slow retrieval of an image-laden web page.
  +        Since non-persistent connections have their own resource
  +        consumptions and a short waiting period after each closure,
  +        a busy server may need persistence in order to best serve
  +        its clients.</p>
  +
  +        <p>As far as we know, the client-caused FIN_WAIT_2 problem
  +        is present for all servers that support persistent
  +        connections, including Apache 1.1.x and 1.2.</p>
  +
  +        <h3>A necessary bit of code introduced in 1.2</h3>
  +        While the above bug is a problem, it is not the whole
  +        problem. Some users have observed no FIN_WAIT_2 problems
  +        with Apache 1.1.x, but with 1.2b enough connections build
  +        up in the FIN_WAIT_2 state to crash their server. The most
  +        likely source for additional FIN_WAIT_2 states is a
  +        function called <code>lingering_close()</code> which was
  +        added between 1.1 and 1.2. This function is necessary for
  +        the proper handling of persistent connections and any
  +        request which includes content in the message body
  +        (<em>e.g.</em>, PUTs and POSTs). What it does is read any
  +        data sent by the client for a certain time after the server
  +        closes the connection. The exact reasons for doing this are
  +        somewhat complicated, but involve what happens if the
  +        client is making a request at the same time the server
  +        sends a response and closes the connection. Without
  +        lingering, the client might be forced to reset its TCP
  +        input buffer before it has a chance to read the server's
  +        response, and thus understand why the connection has
  +        closed. See the <a href="#appendix">appendix</a> for more
  +        details. 
  +
  +        <p>The code in <code>lingering_close()</code> appears to
  +        cause problems for a number of reasons, including the
  +        change in traffic patterns that it causes. The code has
  +        been thoroughly reviewed and we are not aware of any bugs
  +        in it. It is possible that there is some problem in the BSD
  +        TCP stack, aside from the lack of a timeout for the
  +        FIN_WAIT_2 state, exposed by the
  +        <code>lingering_close</code> code that causes the observed
  +        problems.</p>
  +      </li>
  +
  +      <li>
  +        <h2>What can I do about it?</h2>
  +        There are several possible workarounds to the problem,
  +        some of which work better than others. 
  +
  +        <h3>Add a timeout for FIN_WAIT_2</h3>
  +        The obvious workaround is to simply have a timeout for the
  +        FIN_WAIT_2 state. This is not specified by the RFC, and
  +        could be claimed to be a violation of the RFC, but it is
  +        widely recognized as being necessary. The following systems
  +        are known to have a timeout: 
  +
  +        <ul>
  +          <li><a href="http://www.freebsd.org/">FreeBSD</a>
  +          versions starting at 2.0 or possibly earlier.</li>
  +
  +          <li><a href="http://www.netbsd.org/">NetBSD</a> version
  +          1.2(?)</li>
  +
  +          <li><a href="http://www.openbsd.org/">OpenBSD</a> all
  +          versions(?)</li>
  +
  +          <li><a href="http://www.bsdi.com/">BSD/OS</a> 2.1, with
  +          the <a
  +          href="ftp://ftp.bsdi.com/bsdi/patches/patches-2.1/K210-027">
  +          K210-027</a> patch installed.</li>
  +
  +          <li><a href="http://www.sun.com/">Solaris</a> as of
  +          around version 2.2. The timeout can be tuned by using
  +          <code>ndd</code> to modify
  +          <code>tcp_fin_wait_2_flush_interval</code>, but the
  +          default should be appropriate for most servers and
  +          improper tuning can have negative impacts.</li>
  +
  +          <li><a href="http://www.linux.org/">Linux</a> 2.0.x and
  +          earlier(?)</li>
  +
  +          <li><a href="http://www.hp.com/">HP-UX</a> 10.x defaults
  +          to terminating connections in the FIN_WAIT_2 state after
  +          the normal keepalive timeouts. This does not refer to the
  +          persistent connection or HTTP keepalive timeouts, but the
  +          <code>SO_LINGER</code> socket option which is enabled by
  +          Apache. This parameter can be adjusted by using
  +          <code>nettune</code> to modify parameters such as
  +          <code>tcp_keepstart</code> and <code>tcp_keepstop</code>.
  +          In later revisions, there is an explicit timer for
  +          connections in FIN_WAIT_2 that can be modified; contact
  +          HP support for details.</li>
  +
  +          <li><a href="http://www.sgi.com/">SGI IRIX</a> can be
  +          patched to support a timeout. For IRIX 5.3, 6.2, and 6.3,
  +          use patches 1654, 1703 and 1778 respectively. If you have
  +          trouble locating these patches, please contact your SGI
  +          support channel for help.</li>
  +
  +          <li><a href="http://www.ncr.com/">NCR's MP RAS Unix</a>
  +          2.xx and 3.xx both have FIN_WAIT_2 timeouts. In 2.xx it
  +          is non-tunable at 600 seconds, while in 3.xx it defaults
  +          to 600 seconds and is calculated based on the tunable
  +          "max keep alive probes" (default of 8) multiplied by the
  +          "keep alive interval" (default 75 seconds).</li>
  +
  +          <li><a href="http://www.sequent.com">Sequent's ptx/TCP/IP
  +          for DYNIX/ptx</a> has had a FIN_WAIT_2 timeout since
  +          around release 4.1 in mid-1994.</li>
  +        </ul>
  +
  +        <p>The following systems are known to not have a
  +        timeout:</p>
  +
  +        <ul>
  +          <li><a href="http://www.sun.com/">SunOS 4.x</a> does not
  +          and almost certainly never will have one because it is at
  +          the very end of its development cycle for Sun. If you
  +          have kernel source, it should be easy to patch.</li>
  +        </ul>
  +
  +        <p>There is a <a
  +        href="http://www.apache.org/dist/httpd/contrib/patches/1.2/fin_wait_2.patch">
  +        patch available</a> for adding a timeout to the FIN_WAIT_2
  +        state; it was originally intended for BSD/OS, but should be
  +        adaptable to most systems using BSD networking code. You
  +        need kernel source code to be able to use it. If you do
  +        adapt it to work for any other systems, please drop me a
  +        note at <a
  +        href="mailto:marc@apache.org">marc@apache.org</a>.</p>
  +
  +        <h3>Compile without using
  +        <code>lingering_close()</code></h3>
  +        It is possible to compile Apache 1.2 without using the
  +        <code>lingering_close()</code> function. This will result
  +        in that section of code being similar to that which was in
  +        1.1. If you do this, be aware that it can cause problems
  +        with PUTs, POSTs and persistent connections, especially if
  +        the client uses pipelining. That said, it is no worse than
  +        on 1.1, and we understand that keeping your server running
  +        is quite important. 
  +
  +        <p>To compile without the <code>lingering_close()</code>
  +        function, add <code>-DNO_LINGCLOSE</code> to the end of the
  +        <code>EXTRA_CFLAGS</code> line in your
  +        <code>Configuration</code> file, rerun
  +        <code>Configure</code> and rebuild the server.</p>
  +
  +        <h3>Use <code>SO_LINGER</code> as an alternative to
  +        <code>lingering_close()</code></h3>
  +        On most systems, there is an option called
  +        <code>SO_LINGER</code> that can be set with
  +        <code>setsockopt(2)</code>. It does something very similar
  +        to <code>lingering_close()</code>, except that it is broken
  +        on many systems so that it causes far more problems than
  +        <code>lingering_close</code>. On some systems, it could
  +        possibly work better so it may be worth a try if you have
  +        no other alternatives. 
  +
  +        <p>To try it, add <code>-DUSE_SO_LINGER
  +        -DNO_LINGCLOSE</code> to the end of the
  +        <code>EXTRA_CFLAGS</code> line in your
  +        <code>Configuration</code> file, rerun
  +        <code>Configure</code> and rebuild the server.</p>
  +
  +        <p><strong>NOTE:</strong> Attempting to use
  +        <code>SO_LINGER</code> and <code>lingering_close()</code>
  +        at the same time is very likely to do very bad things, so
  +        don't.</p>
  +
  +        <h3>Increase the amount of memory used for storing
  +        connection state</h3>
  +
  +        <dl>
  +          <dt>BSD based networking code:</dt>
  +
  +          <dd>
  +            BSD stores network data, such as connection states, in
  +            something called an mbuf. When you get so many
  +            connections that the kernel does not have enough mbufs
  +            to put them all in, your kernel will likely crash. You
  +            can reduce the effects of the problem by increasing the
  +            number of mbufs that are available; this will not
  +            prevent the problem, it will just make the server go
  +            longer before crashing. 
  +
  +            <p>The exact way to increase them may depend on your
  +            OS; look for some reference to the number of "mbufs" or
  +            "mbuf clusters". On many systems, this can be done by
  +            adding the line <code>NMBCLUSTERS="n"</code> to your
  +            kernel config file, where <code>n</code> is the number
  +            of mbuf clusters you want, and then rebuilding your
  +            kernel.</p>
  +          </dd>
  +        </dl>
  +
  +        <h3>Disable KeepAlive</h3>
  +
  +        <p>If you are unable to do any of the above then you
  +        should, as a last resort, disable KeepAlive. Edit your
  +        httpd.conf and change "KeepAlive On" to "KeepAlive
  +        Off".</p>
  +      </li>
  +
  +      <li>
  +        <h2>Feedback</h2>
  +        If you have any information to add to this page, please
  +        contact me at <a
  +        href="mailto:marc@apache.org">marc@apache.org</a>. 
  +      </li>
  +
  +      <li>
  +        <h2><a id="appendix" name="appendix">Appendix</a></h2>
  +
  +
  +        <p>Below is a message from Roy Fielding, one of the authors
  +        of HTTP/1.1.</p>
  +
  +        <h3>Why the lingering close functionality is necessary with
  +        HTTP</h3>
  +        The need for a server to linger on a socket after a close
  +        is noted a couple times in the HTTP specs, but not
  +        explained. This explanation is based on discussions between
  +        myself, Henrik Frystyk, Robert S. Thau, Dave Raggett, and
  +        John C. Mallery in the hallways of MIT while I was at W3C. 
  +
  +        <p>If a server closes the input side of the connection
  +        while the client is sending data (or is planning to send
  +        data), then the server's TCP stack will signal an RST
  +        (reset) back to the client. Upon receipt of the RST, the
  +        client will flush its own incoming TCP buffer back to the
  +        un-ACKed packet indicated by the RST packet argument. If
  +        the server has sent a message, usually an error response,
  +        to the client just before the close, and the client
  +        receives the RST packet before its application code has
  +        read the error message from its incoming TCP buffer and
  +        before the server has received the ACK sent by the client
  +        upon receipt of that buffer, then the RST will flush the
  +        error message before the client application has a chance to
  +        see it. The result is that the client is left thinking that
  +        the connection failed for no apparent reason.</p>
  +
  +        <p>There are two conditions under which this is likely to
  +        occur:</p>
  +
  +        <ol>
  +          <li>sending POST or PUT data without proper
  +          authorization</li>
  +
  +          <li>sending multiple requests before each response
  +          (pipelining) and one of the middle requests resulting in
  +          an error or other break-the-connection result.</li>
  +        </ol>
  +
  +        <p>The solution in all cases is to send the response, close
  +        only the write half of the connection (what shutdown is
  +        supposed to do), and continue reading on the socket until
  +        it is either closed by the client (signifying it has
  +        finally read the response) or a timeout occurs. That is
  +        what the kernel is supposed to do if SO_LINGER is set.
  +        Unfortunately, SO_LINGER has no effect on some systems; on
  +        some other systems, it does not have its own timeout and
  +        thus the TCP memory segments just pile-up until the next
  +        reboot (planned or not).</p>
  +
  +        <p>Please note that simply removing the linger code will
  +        not solve the problem -- it only moves it to a different
  +        and much harder one to detect.</p>
  +      </li>
  +    </ol>
  +    <!--#include virtual="footer.html" -->
  +  </body>
  +</html>
  +
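As a rough illustration of the technique described in the appendix of fin_wait_2.html above (send the response, close only the write half, then keep reading until the client closes or a timeout occurs), here is a minimal C sketch. It is not Apache's lingering_close() code; the helper name linger_then_close and its timeout parameter are hypothetical, and it assumes a POSIX sockets API.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not Apache's lingering_close()):
 * half-close the socket, then drain whatever the peer still sends
 * until it closes its side or timeout_secs passes with no data. */
static int linger_then_close(int fd, int timeout_secs)
{
    char buf[512];

    /* Close only the write half: the peer sees our FIN but can
     * still send, so its late data does not trigger an RST. */
    if (shutdown(fd, SHUT_WR) != 0) {
        close(fd);
        return -1;
    }

    for (;;) {
        fd_set rfds;
        struct timeval tv;
        ssize_t n;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_secs;
        tv.tv_usec = 0;

        /* Wait for more data, EOF, or the timeout. */
        if (select(fd + 1, &rfds, NULL, NULL, &tv) <= 0)
            break;              /* timed out or select failed */

        n = read(fd, buf, sizeof(buf));
        if (n <= 0)
            break;              /* 0: peer closed; <0: read error */
        /* Data read here is deliberately discarded. */
    }

    return close(fd);
}
```

A server would call this after writing its final response, in place of a plain close(). The timeout keeps the server from waiting indefinitely on a client that never closes its side.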
  
  
  
  1.5       +17 -6     httpd-2.0/docs/manual/misc/footer.html
  
  Index: footer.html
  ===================================================================
  RCS file: /home/cvs/httpd-2.0/docs/manual/misc/footer.html,v
  retrieving revision 1.4
  retrieving revision 1.5
  diff -u -r1.4 -r1.5
  --- footer.html	2000/11/24 22:46:51	1.4
  +++ footer.html	2001/09/22 19:33:40	1.5
  @@ -1,8 +1,19 @@
  -<HR>
  +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  +    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
   
  -<H3 ALIGN="CENTER">
  - Apache HTTP Server Version 2.0 
  -</H3>
  +<html xmlns="http://www.w3.org/1999/xhtml">
  +  <head>
  +    <meta name="generator" content="HTML Tidy, see www.w3.org" />
   
  -<A HREF="./"><IMG SRC="../images/index.gif" ALT="Index"></A>
  -<A HREF="../"><IMG SRC="../images/home.gif" ALT="Home"></A>
  +    <title></title>
  +  </head>
  +
  +  <body>
  +    <hr />
  +
  +    <h3 align="CENTER">Apache HTTP Server Version 2.0</h3>
  +    <a href="./"><img src="../images/index.gif" alt="Index" /></a>
  +    <a href="../"><img src="../images/home.gif" alt="Home" /></a>
  +  </body>
  +</html>
  +
  
  
  
  1.5       +19 -6     httpd-2.0/docs/manual/misc/header.html
  
  Index: header.html
  ===================================================================
  RCS file: /home/cvs/httpd-2.0/docs/manual/misc/header.html,v
  retrieving revision 1.4
  retrieving revision 1.5
  diff -u -r1.4 -r1.5
  --- header.html	2000/11/24 22:46:50	1.4
  +++ header.html	2001/09/22 19:33:40	1.5
  @@ -1,6 +1,19 @@
  -<DIV ALIGN="CENTER">
  - <IMG SRC="../images/sub.gif" ALT="[APACHE DOCUMENTATION]">
  - <H3>
  -  Apache HTTP Server Version 2.0 
  - </H3>
  -</DIV>
  +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  +    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  +
  +<html xmlns="http://www.w3.org/1999/xhtml">
  +  <head>
  +    <meta name="generator" content="HTML Tidy, see www.w3.org" />
  +
  +    <title></title>
  +  </head>
  +
  +  <body>
  +    <div align="CENTER">
  +      <img src="../images/sub.gif" alt="[APACHE DOCUMENTATION]" /> 
  +
  +      <h3>Apache HTTP Server Version 2.0</h3>
  +    </div>
  +  </body>
  +</html>
  +
  
  
  
  1.18      +67 -69    httpd-2.0/docs/manual/misc/index.html
  
  Index: index.html
  ===================================================================
  RCS file: /home/cvs/httpd-2.0/docs/manual/misc/index.html,v
  retrieving revision 1.17
  retrieving revision 1.18
  diff -u -r1.17 -r1.18
  --- index.html	2001/09/22 15:45:22	1.17
  +++ index.html	2001/09/22 19:33:40	1.18
  @@ -1,69 +1,67 @@
  -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
  -<HTML>
  - <HEAD>
  -  <TITLE>Apache Miscellaneous Documentation</TITLE>
  - </HEAD>
  -
  - <!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  - <BODY
  -  BGCOLOR="#FFFFFF"
  -  TEXT="#000000"
  -  LINK="#0000FF"
  -  VLINK="#000080"
  -  ALINK="#FF0000"
  - >
  -  <!--#include virtual="header.html" -->
  -  <H1 ALIGN="CENTER">Apache Miscellaneous Documentation</H1>
  -
  -  <P>
  -  Below is a list of additional documentation pages that apply to the
  -  Apache web server development project.
  -  </P>
  -  <DL>
  -   <DT><A HREF="custom_errordocs.html">How to use XSSI and Negotiation 
  -	for custom ErrorDocuments</A>
  -   </DT>
  -   <DD>Describes a solution which uses XSSI and negotiation
  -    to custom-tailor the Apache ErrorDocuments to taste, adding the
  -    advantage of returning internationalized versions of the error
  -    messages depending on the client's language preferences.
  -   </DD>
  -   <DT><A HREF="descriptors.html">File Descriptor use in Apache</A>
  -   <DD>Describes how Apache uses file descriptors and talks about various
  -    limits imposed on the number of descriptors available by various 
  -    operating systems.
  -   </DD>
  -   <DT><A
  -        HREF="fin_wait_2.html"
  -       ><SAMP>FIN_WAIT_2</SAMP></A>
  -   </DT>
  -   <DD>A description of the causes of Apache connections going into the
  -    <SAMP>FIN_WAIT_2</SAMP> state, and what you can do about it.
  -   </DD>
  -   <DT><A
  -        HREF="known_client_problems.html"
  -       >Known Client Problems</A>
  -   </DT>
  -   <DD>A list of problems in HTTP clients which can be mitigated by Apache.
  -   </DD>
  -   <DT><A
  -        HREF="perf-tuning.html"
  -       >Performance Notes -- Apache Tuning</A>
  -   </DT>
  -   <DD>Notes about how to (run-time and compile-time) configure
  -       Apache for highest performance.  Notes explaining why Apache does
  -       some things, and why it doesn't do other things (which make it
  -       slower/faster).
  -   </DD>
  -   <DT><A
  -        HREF="security_tips.html"
  -       >Security Tips</A>
  -   </DT>
  -   <DD>Some &quot;do&quot;s  - and &quot;don't&quot;s - for keeping your
  -    Apache web site secure.
  -   </DD>
  -  </DL>
  -
  -  <!--#include virtual="footer.html" -->
  - </BODY>
  -</HTML>
  +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  +    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  +
  +<html xmlns="http://www.w3.org/1999/xhtml">
  +  <head>
  +    <meta name="generator" content="HTML Tidy, see www.w3.org" />
  +
  +    <title>Apache Miscellaneous Documentation</title>
  +  </head>
  +  <!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  +
  +  <body bgcolor="#FFFFFF" text="#000000" link="#0000FF"
  +  vlink="#000080" alink="#FF0000">
  +    <!--#include virtual="header.html" -->
  +
  +    <h1 align="CENTER">Apache Miscellaneous Documentation</h1>
  +
  +    <p>Below is a list of additional documentation pages that apply
  +    to the Apache web server development project.</p>
  +
  +    <dl>
  +      <dt><a href="custom_errordocs.html">How to use XSSI and
  +      Negotiation for custom ErrorDocuments</a></dt>
  +
  +      <dd>Describes a solution which uses XSSI and negotiation to
  +      custom-tailor the Apache ErrorDocuments to taste, adding the
  +      advantage of returning internationalized versions of the
  +      error messages depending on the client's language
  +      preferences.</dd>
  +
  +      <dt><a href="descriptors.html">File Descriptor use in
  +      Apache</a></dt>
  +
  +      <dd>Describes how Apache uses file descriptors and talks
  +      about various limits imposed on the number of descriptors
  +      available by various operating systems.</dd>
  +
  +      <dt><a
  +      href="fin_wait_2.html"><samp>FIN_WAIT_2</samp></a></dt>
  +
  +      <dd>A description of the causes of Apache processes going
  +      into the <samp>FIN_WAIT_2</samp> state, and what you can do
  +      about it.</dd>
  +
  +      <dt><a href="known_client_problems.html">Known Client
  +      Problems</a></dt>
  +
  +      <dd>A list of problems in HTTP clients which can be mitigated
  +      by Apache.</dd>
  +
  +      <dt><a href="perf-tuning.html">Performance Notes -- Apache
  +      Tuning</a></dt>
  +
  +      <dd>Notes about how to (run-time and compile-time) configure
  +      Apache for highest performance. Notes explaining why Apache
  +      does some things, and why it doesn't do other things (which
  +      make it slower/faster).</dd>
  +
  +      <dt><a href="security_tips.html">Security Tips</a></dt>
  +
  +      <dd>Some "do"s - and "don't"s - for keeping your Apache web
  +      site secure.</dd>
  +    </dl>
  +    <!--#include virtual="footer.html" -->
  +  </body>
  +</html>
  +
  
  
  
  1.20      +343 -302  httpd-2.0/docs/manual/misc/known_client_problems.html
  
  Index: known_client_problems.html
  ===================================================================
  RCS file: /home/cvs/httpd-2.0/docs/manual/misc/known_client_problems.html,v
  retrieving revision 1.19
  retrieving revision 1.20
  diff -u -r1.19 -r1.20
  --- known_client_problems.html	2001/03/28 21:26:29	1.19
  +++ known_client_problems.html	2001/09/22 19:33:40	1.20
  @@ -1,305 +1,346 @@
  -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
  -<HTML>
  -<HEAD>
  -<TITLE>Apache HTTP Server Project</TITLE>
  -</HEAD>
  -
  -<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  -<BODY
  - BGCOLOR="#FFFFFF"
  - TEXT="#000000"
  - LINK="#0000FF"
  - VLINK="#000080"
  - ALINK="#FF0000"
  ->
  -<!--#include virtual="header.html" -->
  -<H1 ALIGN="CENTER">Known Problems in Clients</H1>
  -
  -<P>Over time the Apache Group has discovered or been notified of problems
  -with various clients which we have had to work around, or explain.
  -This document describes these problems and the workarounds available.
  -It's not arranged in any particular order.  Some familiarity with the
  -standards is assumed, but not necessary.
  -
  -<P>For brevity, <EM>Navigator</EM> will refer to Netscape's Navigator
  -product (which in later versions was renamed "Communicator" and
  -various other names), and <EM>MSIE</EM> will refer to Microsoft's
  -Internet Explorer product.  All trademarks and copyrights belong to
  -their respective companies.  We welcome input from the various client
  -authors to correct inconsistencies in this paper, or to provide us with
  -exact version numbers where things are broken/fixed.
  -
  -<P>For reference,
  -<A HREF="ftp://ds.internic.net/rfc/rfc1945.txt">RFC1945</A>
  -defines HTTP/1.0, and
  -<A HREF="ftp://ds.internic.net/rfc/rfc2068.txt">RFC2068</A>
  -defines HTTP/1.1.  Apache as of version 1.2 is an HTTP/1.1 server (with an 
  -optional HTTP/1.0 proxy).
  -
  -<P>Various of these workarounds are triggered by environment variables.
  -The admin typically controls which are set, and for which clients, by using 
  -<A HREF="../mod/mod_browser.html">mod_browser</A>.  Unless otherwise
  -noted all of these workarounds exist in versions 1.2 and later.
  -
  -<H3><A NAME="trailing-crlf">Trailing CRLF on POSTs</A></H3>
  -
  -<P>This is a legacy issue.  The CERN webserver required <CODE>POST</CODE>
  -data to have an extra <CODE>CRLF</CODE> following it.  Thus many
  -clients send an extra <CODE>CRLF</CODE> that
  -is not included in the <CODE>Content-Length</CODE> of the request.
  -Apache works around this problem by eating any empty lines which
  -appear before a request.
  -
  -<H3><A NAME="broken-keepalive">Broken keepalive</A></H3>
  -
  -<P>Various clients have had broken implementations of <EM>keepalive</EM>
  -(persistent connections).  In particular the Windows versions of
  -Navigator 2.0 get very confused when the server times out an
  -idle connection.  The workaround is present in the default config files:
  -<BLOCKQUOTE><CODE>
  -BrowserMatch Mozilla/2 nokeepalive
  -</CODE></BLOCKQUOTE>
  -Note that this matches some earlier versions of MSIE, which began the
  -practice of calling themselves <EM>Mozilla</EM> in their user-agent
  -strings just like Navigator.
  -
  -<P>MSIE 4.0b2, which claims to support HTTP/1.1, does not properly
  -support keepalive when it is used on 301 or 302 (redirect)
  -responses.  Unfortunately Apache's <CODE>nokeepalive</CODE> code
  -prior to 1.2.2 would not work with HTTP/1.1 clients.  You must apply
  -<A
  -HREF="http://www.apache.org/dist/httpd/patches/apply_to_1.2.1/msie_4_0b2_fixes.patch"
  ->this patch</A> to version 1.2.1.  Then add this to your config:
  -<BLOCKQUOTE><CODE>
  -BrowserMatch "MSIE 4\.0b2;" nokeepalive
  -</CODE></BLOCKQUOTE>
  -
  -<H3><A NAME="force-response-1.0">Incorrect interpretation of
  -<CODE>HTTP/1.1</CODE> in response</A></H3>
  -
  -<P>To quote from section 3.1 of RFC1945:
  -<BLOCKQUOTE>
  -HTTP uses a "&lt;MAJOR&gt;.&lt;MINOR&gt;" numbering scheme to indicate versions
  -of the protocol. The protocol versioning policy is intended to allow
  -the sender to indicate the format of a message and its capacity for
  -understanding further HTTP communication, rather than the features
  -obtained via that communication.
  -</BLOCKQUOTE>
  -Since Apache is an HTTP/1.1 server, it indicates so as part of its
  -response.  Many client authors mistakenly treat this part of the response
  -as an indication of the protocol that the response is in, and then refuse
  -to accept the response.
  -
  -<P>The first major indication of this problem was with AOL's proxy servers.
  -When Apache 1.2 went into beta it was the first wide-spread HTTP/1.1
  -server.  After some discussion, AOL fixed their proxies.  In
  -anticipation of similar problems, the <CODE>force-response-1.0</CODE>
  -environment variable was added to Apache.  When present Apache will
  -indicate "HTTP/1.0" in response to an HTTP/1.0 client,
  -but will not in any other way change the response.
  -
  -<P>The pre-1.1 Java Development Kit (JDK) that is used in many clients
  -(including Navigator 3.x and MSIE 3.x) exhibits this problem.  As do some
  -of the early pre-releases of the 1.1 JDK.  We think it is fixed in the
  -1.1 JDK release.  In any event the workaround:
  -<BLOCKQUOTE><CODE>
  -BrowserMatch Java/1.0 force-response-1.0 <BR>
  -BrowserMatch JDK/1.0 force-response-1.0 
  -</CODE></BLOCKQUOTE>
  -
  -<P>RealPlayer 4.0 from Progressive Networks also exhibits this problem.
  -However they have fixed it in version 4.01 of the player, but version
  -4.01 uses the same <CODE>User-Agent</CODE> as version 4.0.  The
  -workaround is still:
  -<BLOCKQUOTE><CODE>
  -BrowserMatch "RealPlayer 4.0" force-response-1.0
  -</CODE></BLOCKQUOTE>
  -
  -<H3><A NAME="msie4.0b2">Requests use HTTP/1.1 but responses must be
  -in HTTP/1.0</A></H3>
  -
  -<P>MSIE 4.0b2 has this problem.  Its Java VM makes requests in HTTP/1.1
  -format but the responses must be in HTTP/1.0 format (in particular, it
  -does not understand <EM>chunked</EM> responses).  The workaround
  -is to fool Apache into believing the request came in HTTP/1.0 format.
  -<BLOCKQUOTE><CODE>
  -BrowserMatch "MSIE 4\.0b2;" downgrade-1.0 force-response-1.0
  -</CODE></BLOCKQUOTE>
  -This workaround is available in 1.2.2, and in a
  -<A
  -HREF="http://www.apache.org/dist/httpd/patches/apply_to_1.2.1/msie_4_0b2_fixes.patch"
  ->patch</A> against 1.2.1.
  -
  -<H3><A NAME="257th-byte">Boundary problems with header parsing</A></H3>
  -
  -<P>All versions of Navigator from 2.0 through 4.0b2 (and possibly later)
  -have a problem if the trailing CRLF of the response header starts at
  -offset 256, 257 or 258 of the response.  A BrowserMatch for this would
  -match on nearly every hit, so the workaround is enabled automatically
  -on all responses.  The workaround implemented detects when this condition would
  -occur in a response and adds extra padding to the header to push the
  -trailing CRLF past offset 258 of the response.
  -
  -<H3><A NAME="boundary-string">Multipart responses and Quoted Boundary
  -Strings</A></H3>
  -
  -<P>On multipart responses some clients will not accept quotes (")
  -around the boundary string.  The MIME standard recommends that
  -such quotes be used.  But the clients were probably written based
  -on one of the examples in RFC2068, which does not include quotes.
  -Apache does not include quotes on its boundary strings to workaround
  -this problem.
  -
  -<H3><A NAME="byterange-requests">Byterange requests</A></H3>
  -
  -<P>A byterange request is used when the client wishes to retrieve a
  -portion of an object, not necessarily the entire object.  There
  -was a very old draft which included these byteranges in the URL.
  -Old clients such as Navigator 2.0b1 and MSIE 3.0 for the MAC
  -exhibit this behaviour, and
  -it will appear in the servers' access logs as (failed) attempts to
  -retrieve a URL with a trailing ";xxx-yyy".  Apache does not attempt
  -to implement this at all.
  -
  -<P>A subsequent draft of this standard defines a header
  -<CODE>Request-Range</CODE>, and a response type
  -<CODE>multipart/x-byteranges</CODE>.  The HTTP/1.1 standard includes
  -this draft with a few fixes, and it defines the header
  -<CODE>Range</CODE> and type <CODE>multipart/byteranges</CODE>.
  -
  -<P>Navigator (versions 2 and 3) sends both <CODE>Range</CODE> and
  -<CODE>Request-Range</CODE> headers (with the same value), but does not
  -accept a <CODE>multipart/byteranges</CODE> response.  The response must
  -be <CODE>multipart/x-byteranges</CODE>.  As a workaround, if Apache
  -receives a <CODE>Request-Range</CODE> header it considers it "higher
  -priority" than a <CODE>Range</CODE> header and in response uses
  -<CODE>multipart/x-byteranges</CODE>.
  -
  -<P>The Adobe Acrobat Reader plugin makes extensive use of byteranges and
  -prior to version 3.01 supports only the <CODE>multipart/x-byterange</CODE>
  -response.  Unfortunately there is no clue that it is the plugin
  -making the request.  If the plugin is used with Navigator, the above
  -workaround works fine.  But if the plugin is used with MSIE 3 (on
  -Windows) the workaround won't work because MSIE 3 doesn't give the
  -<CODE>Range-Request</CODE> clue that Navigator does.  To workaround this,
  -Apache special cases "MSIE 3" in the <CODE>User-Agent</CODE> and serves
  -<CODE>multipart/x-byteranges</CODE>.  Note that the necessity for this
  -with MSIE 3 is actually due to the Acrobat plugin, not due to the browser.
  -
  -<P>Netscape Communicator appears to not issue the non-standard
  -<CODE>Request-Range</CODE> header.  When an Acrobat plugin prior to
  -version 3.01 is used with it, it will not properly understand byteranges.
  -The user must upgrade their Acrobat reader to 3.01.
  -
  -<H3><A NAME="cookie-merge"><CODE>Set-Cookie</CODE> header is
  -unmergeable</A></H3>
  -
  -<P>The HTTP specifications say that it is legal to merge headers with
  -duplicate names into one (separated by commas).  Some browsers
  -that support Cookies don't like merged headers and prefer that each
  -<CODE>Set-Cookie</CODE> header is sent separately.  When parsing the
  -headers returned by a CGI, Apache will explicitly avoid merging any
  -<CODE>Set-Cookie</CODE> headers.
  -
  -<H3><A NAME="gif89-expires"><CODE>Expires</CODE> headers and GIF89A
  -animations</A></H3>
  -
  -<P>Navigator versions 2 through 4 will erroneously re-request
  -GIF89A animations on each loop of the animation if the first
  -response included an <CODE>Expires</CODE> header.  This happens
  -regardless of how far in the future the expiry time is set.  There
  -is no workaround supplied with Apache, however there are hacks for <A
  -HREF="http://www.arctic.org/~dgaudet/patches/apache-1.2-gif89-expires-hack.patch">1.2</A>
  -and for <A
  -HREF="http://www.arctic.org/~dgaudet/patches/apache-1.3-gif89-expires-hack.patch">1.3</A>.
  -
  -<H3><A NAME="no-content-length"><CODE>POST</CODE> without
  -<CODE>Content-Length</CODE></A></H3>
  -
  -<P>In certain situations Navigator 3.01 through 3.03 appear to incorrectly
  -issue a POST without the request body.  There is no
  -known workaround.  It has been fixed in Navigator 3.04, Netscapes
  -provides some
  -<A HREF="http://help.netscape.com/kb/client/971014-42.html">information</A>.
  -There's also
  -<A HREF="http://www.arctic.org/~dgaudet/apache/no-content-length/">
  -some information</A> about the actual problem.
  -
  -<H3><A NAME="jdk-12-bugs">JDK 1.2 betas lose parts of responses.</A></H3>
  -
  -<P>The http client in the JDK1.2beta2 and beta3 will throw away the first part of
  -the response body when both the headers and the first part of the body are sent
  -in the same network packet AND keep-alive's are being used. If either condition
  -is not met then it works fine.
  -
  -<P>See also Bug-ID's 4124329 and 4125538 at the java developer connection.
  -
  -<P>If you are seeing this bug yourself, you can add the following BrowserMatch
  -directive to work around it:
  -
  -<BLOCKQUOTE><CODE>
  -BrowserMatch "Java1\.2beta[23]" nokeepalive
  -</CODE></BLOCKQUOTE>
  -
  -<P>We don't advocate this though since bending over backwards for beta software
  -is usually not a good idea; ideally it gets fixed, new betas or a final release
  -comes out, and no one uses the broken old software anymore.  In theory.
  -
  -<H3><A NAME="content-type-persistence"><CODE>Content-Type</CODE> change
  -is not noticed after reload</A></H3>
  -
  -<P>Navigator (all versions?) will cache the <CODE>content-type</CODE>
  -for an object "forever".  Using reload or shift-reload will not cause
  -Navigator to notice a <CODE>content-type</CODE> change.  The only
  -work-around is for the user to flush their caches (memory and disk).  By
  -way of an example, some folks may be using an old <CODE>mime.types</CODE>
  -file which does not map <CODE>.htm</CODE> to <CODE>text/html</CODE>,
  -in this case Apache will default to sending <CODE>text/plain</CODE>.
  -If the user requests the page and it is served as <CODE>text/plain</CODE>.
  -After the admin fixes the server, the user will have to flush their caches
  -before the object will be shown with the correct <CODE>text/html</CODE>
  -type.
  -
  -<h3><a name="msie-cookie-y2k">MSIE Cookie problem with expiry date in
  -the year 2000</a></h3>
  -
  -<p>MSIE versions 3.00 and 3.02 (without the Y2K patch) do not handle
  -cookie expiry dates in the year 2000 properly.  Years after 2000 and
  -before 2000 work fine.  This is fixed in IE4.01 service pack 1, and in
  -the Y2K patch for IE3.02.  Users should avoid using expiry dates in the
  -year 2000.
  -
  -<h3><a name="lynx-negotiate-trans">Lynx incorrectly asking for transparent
  -content negotiation</a></h3>
  -
  -<p>The Lynx browser versions 2.7 and 2.8 send a "negotiate: trans" header
  -in their requests, which is an indication the browser supports transparent
  -content negotiation (TCN).  However the browser does not support TCN.
  -As of version 1.3.4, Apache supports TCN, and this causes problems with
  -these versions of Lynx.  As a workaround future versions of Apache will
  -ignore this header when sent by the Lynx client.
  -
  -<h3><a name="ie40-vary">MSIE 4.0 mishandles Vary response header</a></h3>
  -
  -<p>MSIE 4.0 does not handle a Vary header properly.  The Vary header is
  -generated by mod_rewrite in apache 1.3.  The result is an error from MSIE
  -saying it cannot download the requested file.  There are more details
  -in <a href="http://bugs.apache.org/index/full/4118">PR#4118</a>.
  -</P>
  -<P>
  -A workaround is to add the following to your server's configuration
  -files:
  -</P>
  -<PRE>
  -    BrowserMatch "MSIE 4\.0" force-no-vary
  -</PRE>
  -<P>
  -(This workaround is only available with releases <STRONG>after</STRONG>
  -1.3.6 of the Apache Web server.)
  -</P>
  +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  +    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
   
  +<html xmlns="http://www.w3.org/1999/xhtml">
  +  <head>
  +    <meta name="generator" content="HTML Tidy, see www.w3.org" />
  +
  +    <title>Apache HTTP Server Project</title>
  +  </head>
  +  <!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  +
  +  <body bgcolor="#FFFFFF" text="#000000" link="#0000FF"
  +  vlink="#000080" alink="#FF0000">
  +    <!--#include virtual="header.html" -->
  +
  +    <h1 align="CENTER">Known Problems in Clients</h1>
  +
  +    <p>Over time the Apache Group has discovered or been notified
  +    of problems with various clients which we have had to work
  +    around, or explain. This document describes these problems and
  +    the workarounds available. It's not arranged in any particular
  +    order. Some familiarity with the standards is assumed, but not
  +    necessary.</p>
  +
  +    <p>For brevity, <em>Navigator</em> will refer to Netscape's
  +    Navigator product (which in later versions was renamed
  +    "Communicator" and various other names), and <em>MSIE</em> will
  +    refer to Microsoft's Internet Explorer product. All trademarks
  +    and copyrights belong to their respective companies. We welcome
  +    input from the various client authors to correct
  +    inconsistencies in this paper, or to provide us with exact
  +    version numbers where things are broken/fixed.</p>
  +
  +    <p>For reference, <a
  +    href="ftp://ds.internic.net/rfc/rfc1945.txt">RFC1945</a>
  +    defines HTTP/1.0, and <a
  +    href="ftp://ds.internic.net/rfc/rfc2068.txt">RFC2068</a>
  +    defines HTTP/1.1. Apache as of version 1.2 is an HTTP/1.1
  +    server (with an optional HTTP/1.0 proxy).</p>
  +
  +    <p>Various of these workarounds are triggered by environment
  +    variables. The admin typically controls which are set, and for
  +    which clients, by using <a
  +    href="../mod/mod_browser.html">mod_browser</a>. Unless
  +    otherwise noted all of these workarounds exist in versions 1.2
  +    and later.</p>
  +
  +    <h3><a id="trailing-crlf" name="trailing-crlf">Trailing CRLF on
  +    POSTs</a></h3>
  +
  +    <p>This is a legacy issue. The CERN webserver required
  +    <code>POST</code> data to have an extra <code>CRLF</code>
  +    following it. Thus many clients send an extra <code>CRLF</code>
  +    that is not included in the <code>Content-Length</code> of the
  +    request. Apache works around this problem by eating any empty
  +    lines which appear before a request.</p>
  +
  +    <h3><a id="broken-keepalive" name="broken-keepalive">Broken
  +    keepalive</a></h3>
  +
  +    <p>Various clients have had broken implementations of
  +    <em>keepalive</em> (persistent connections). In particular the
  +    Windows versions of Navigator 2.0 get very confused when the
  +    server times out an idle connection. The workaround is present
  +    in the default config files:</p>
  +
  +    <blockquote>
  +      <code>BrowserMatch Mozilla/2 nokeepalive</code>
  +    </blockquote>
  +    Note that this matches some earlier versions of MSIE, which
  +    began the practice of calling themselves <em>Mozilla</em> in
  +    their user-agent strings just like Navigator. 
  +
  +    <p>MSIE 4.0b2, which claims to support HTTP/1.1, does not
  +    properly support keepalive when it is used on 301 or 302
  +    (redirect) responses. Unfortunately Apache's
  +    <code>nokeepalive</code> code prior to 1.2.2 would not work
  +    with HTTP/1.1 clients. You must apply <a
  +    href="http://www.apache.org/dist/httpd/patches/apply_to_1.2.1/msie_4_0b2_fixes.patch">
  +    this patch</a> to version 1.2.1. Then add this to your
  +    config:</p>
  +
  +    <blockquote>
  +      <code>BrowserMatch "MSIE 4\.0b2;" nokeepalive</code>
  +    </blockquote>
  +
  +    <h3><a id="force-response-1.0"
  +    name="force-response-1.0">Incorrect interpretation of
  +    <code>HTTP/1.1</code> in response</a></h3>
  +
  +    <p>To quote from section 3.1 of RFC1945:</p>
  +
  +    <blockquote>
  +      HTTP uses a "&lt;MAJOR&gt;.&lt;MINOR&gt;" numbering scheme to
  +      indicate versions of the protocol. The protocol versioning
  +      policy is intended to allow the sender to indicate the format
  +      of a message and its capacity for understanding further HTTP
  +      communication, rather than the features obtained via that
  +      communication.
  +    </blockquote>
  +    Since Apache is an HTTP/1.1 server, it indicates so as part of
  +    its response. Many client authors mistakenly treat this part of
  +    the response as an indication of the protocol that the response
  +    is in, and then refuse to accept the response. 
  +
  +    <p>The first major indication of this problem was with AOL's
  +    proxy servers. When Apache 1.2 went into beta it was the first
  +    wide-spread HTTP/1.1 server. After some discussion, AOL fixed
  +    their proxies. In anticipation of similar problems, the
  +    <code>force-response-1.0</code> environment variable was added
  +    to Apache. When present Apache will indicate "HTTP/1.0" in
  +    response to an HTTP/1.0 client, but will not in any other way
  +    change the response.</p>
  +
  +    <p>The pre-1.1 Java Development Kit (JDK) that is used in many
  +    clients (including Navigator 3.x and MSIE 3.x) exhibits this
  +    problem. As do some of the early pre-releases of the 1.1 JDK.
  +    We think it is fixed in the 1.1 JDK release. In any event the
  +    workaround:</p>
  +
  +    <blockquote>
  +      <code>BrowserMatch Java/1.0 force-response-1.0<br />
  +       BrowserMatch JDK/1.0 force-response-1.0</code>
  +    </blockquote>
  +
  +    <p>RealPlayer 4.0 from Progressive Networks also exhibits this
  +    problem. However they have fixed it in version 4.01 of the
  +    player, but version 4.01 uses the same <code>User-Agent</code>
  +    as version 4.0. The workaround is still:</p>
  +
  +    <blockquote>
  +      <code>BrowserMatch "RealPlayer 4.0" force-response-1.0</code>
  +    </blockquote>
  +
  +    <h3><a id="msie4.0b2" name="msie4.0b2">Requests use HTTP/1.1
  +    but responses must be in HTTP/1.0</a></h3>
  +
  +    <p>MSIE 4.0b2 has this problem. Its Java VM makes requests in
  +    HTTP/1.1 format but the responses must be in HTTP/1.0 format
  +    (in particular, it does not understand <em>chunked</em>
  +    responses). The workaround is to fool Apache into believing the
  +    request came in HTTP/1.0 format.</p>
  +
  +    <blockquote>
  +      <code>BrowserMatch "MSIE 4\.0b2;" downgrade-1.0
  +      force-response-1.0</code>
  +    </blockquote>
  +    This workaround is available in 1.2.2, and in a <a
  +    href="http://www.apache.org/dist/httpd/patches/apply_to_1.2.1/msie_4_0b2_fixes.patch">
  +    patch</a> against 1.2.1. 
  +
  +    <h3><a id="257th-byte" name="257th-byte">Boundary problems with
  +    header parsing</a></h3>
  +
  +    <p>All versions of Navigator from 2.0 through 4.0b2 (and
  +    possibly later) have a problem if the trailing CRLF of the
  +    response header starts at offset 256, 257 or 258 of the
  +    response. A BrowserMatch for this would match on nearly every
  +    hit, so the workaround is enabled automatically on all
  +    responses. The workaround implemented detects when this
  +    condition would occur in a response and adds extra padding to
  +    the header to push the trailing CRLF past offset 258 of the
  +    response.</p>
  +
  +    <h3><a id="boundary-string" name="boundary-string">Multipart
  +    responses and Quoted Boundary Strings</a></h3>
  +
  +    <p>On multipart responses some clients will not accept quotes
  +    (") around the boundary string. The MIME standard recommends
  +    that such quotes be used. But the clients were probably written
  +    based on one of the examples in RFC2068, which does not include
  +    quotes. Apache does not include quotes on its boundary strings
  +    to work around this problem.</p>
  +
  +    <h3><a id="byterange-requests"
  +    name="byterange-requests">Byterange requests</a></h3>
  +
  +    <p>A byterange request is used when the client wishes to
  +    retrieve a portion of an object, not necessarily the entire
  +    object. There was a very old draft which included these
  +    byteranges in the URL. Old clients such as Navigator 2.0b1 and
  +    MSIE 3.0 for the MAC exhibit this behaviour, and it will appear
  +    in the servers' access logs as (failed) attempts to retrieve a
  +    URL with a trailing ";xxx-yyy". Apache does not attempt to
  +    implement this at all.</p>
  +
  +    <p>A subsequent draft of this standard defines a header
  +    <code>Request-Range</code>, and a response type
  +    <code>multipart/x-byteranges</code>. The HTTP/1.1 standard
  +    includes this draft with a few fixes, and it defines the header
  +    <code>Range</code> and type
  +    <code>multipart/byteranges</code>.</p>
  +
  +    <p>Navigator (versions 2 and 3) sends both <code>Range</code>
  +    and <code>Request-Range</code> headers (with the same value),
  +    but does not accept a <code>multipart/byteranges</code>
  +    response. The response must be
  +    <code>multipart/x-byteranges</code>. As a workaround, if Apache
  +    receives a <code>Request-Range</code> header it considers it
  +    "higher priority" than a <code>Range</code> header and in
  +    response uses <code>multipart/x-byteranges</code>.</p>
  +
  +    <p>The Adobe Acrobat Reader plugin makes extensive use of
  +    byteranges and prior to version 3.01 supports only the
  +    <code>multipart/x-byteranges</code> response. Unfortunately
  +    there is no clue that it is the plugin making the request. If
  +    the plugin is used with Navigator, the above workaround works
  +    fine. But if the plugin is used with MSIE 3 (on Windows) the
  +    workaround won't work because MSIE 3 doesn't give the
  +    <code>Request-Range</code> clue that Navigator does. To work
  +    around this, Apache special-cases "MSIE 3" in the
  +    <code>User-Agent</code> and serves
  +    <code>multipart/x-byteranges</code>. Note that the necessity
  +    for this with MSIE 3 is actually due to the Acrobat plugin, not
  +    due to the browser.</p>
  +
  +    <p>Netscape Communicator appears to not issue the non-standard
  +    <code>Request-Range</code> header. When an Acrobat plugin prior
  +    to version 3.01 is used with it, it will not properly
  +    understand byteranges. The user must upgrade their Acrobat
  +    reader to 3.01.</p>
  +
  +    <h3><a id="cookie-merge"
  +    name="cookie-merge"><code>Set-Cookie</code> header is
  +    unmergeable</a></h3>
  +
  +    <p>The HTTP specifications say that it is legal to merge
  +    headers with duplicate names into one (separated by commas).
  +    Some browsers that support Cookies don't like merged headers
  +    and prefer that each <code>Set-Cookie</code> header is sent
  +    separately. When parsing the headers returned by a CGI, Apache
  +    will explicitly avoid merging any <code>Set-Cookie</code>
  +    headers.</p>
  +
  +    <h3><a id="gif89-expires"
  +    name="gif89-expires"><code>Expires</code> headers and GIF89A
  +    animations</a></h3>
  +
  +    <p>Navigator versions 2 through 4 will erroneously re-request
  +    GIF89A animations on each loop of the animation if the first
  +    response included an <code>Expires</code> header. This happens
  +    regardless of how far in the future the expiry time is set.
  +    There is no workaround supplied with Apache, however there are
  +    hacks for <a
  +    href="http://www.arctic.org/~dgaudet/patches/apache-1.2-gif89-expires-hack.patch">
  +    1.2</a> and for <a
  +    href="http://www.arctic.org/~dgaudet/patches/apache-1.3-gif89-expires-hack.patch">
  +    1.3</a>.</p>
  +
  +    <h3><a id="no-content-length"
  +    name="no-content-length"><code>POST</code> without
  +    <code>Content-Length</code></a></h3>
  +
  +    <p>In certain situations Navigator 3.01 through 3.03 appear to
  +    incorrectly issue a POST without the request body. There is no
  +    known workaround. It has been fixed in Navigator 3.04;
  +    Netscape provides some <a
  +    href="http://help.netscape.com/kb/client/971014-42.html">information</a>.
  +    There's also <a
  +    href="http://www.arctic.org/~dgaudet/apache/no-content-length/">
  +    some information</a> about the actual problem.</p>
  +
  +    <h3><a id="jdk-12-bugs" name="jdk-12-bugs">JDK 1.2 betas lose
  +    parts of responses.</a></h3>
  +
  +    <p>The http client in the JDK1.2beta2 and beta3 will throw away
  +    the first part of the response body when both the headers and
  +    the first part of the body are sent in the same network packet
  +    AND keep-alive's are being used. If either condition is not met
  +    then it works fine.</p>
  +
  +    <p>See also Bug-ID's 4124329 and 4125538 at the java developer
  +    connection.</p>
  +
  +    <p>If you are seeing this bug yourself, you can add the
  +    following BrowserMatch directive to work around it:</p>
  +
  +    <blockquote>
  +      <code>BrowserMatch "Java1\.2beta[23]" nokeepalive</code>
  +    </blockquote>
  +
  +    <p>We don't advocate this though since bending over backwards
  +    for beta software is usually not a good idea; ideally it gets
  +    fixed, new betas or a final release comes out, and no one uses
  +    the broken old software anymore. In theory.</p>
  +
  +    <h3><a id="content-type-persistence"
  +    name="content-type-persistence"><code>Content-Type</code>
  +    change is not noticed after reload</a></h3>
  +
  +    <p>Navigator (all versions?) will cache the
  +    <code>content-type</code> for an object "forever". Using reload
  +    or shift-reload will not cause Navigator to notice a
  +    <code>content-type</code> change. The only work-around is for
  +    the user to flush their caches (memory and disk). By way of an
  +    example, some folks may be using an old <code>mime.types</code>
  +    file which does not map <code>.htm</code> to
  +    <code>text/html</code>; in this case Apache will default to
  +    sending <code>text/plain</code>. If the user requests the page
  +    and it is served as <code>text/plain</code>, then after the admin
  +    fixes the server, the user will have to flush their caches
  +    before the object will be shown with the correct
  +    <code>text/html</code> type.</p>
  +
  +    <h3><a id="msie-cookie-y2k" name="msie-cookie-y2k">MSIE Cookie
  +    problem with expiry date in the year 2000</a></h3>
  +
  +    <p>MSIE versions 3.00 and 3.02 (without the Y2K patch) do not
  +    handle cookie expiry dates in the year 2000 properly. Years
  +    after 2000 and before 2000 work fine. This is fixed in IE4.01
  +    service pack 1, and in the Y2K patch for IE3.02. Users should
  +    avoid using expiry dates in the year 2000.</p>
  +
  +    <h3><a id="lynx-negotiate-trans"
  +    name="lynx-negotiate-trans">Lynx incorrectly asking for
  +    transparent content negotiation</a></h3>
  +
  +    <p>The Lynx browser versions 2.7 and 2.8 send a "negotiate:
  +    trans" header in their requests, which is an indication the
  +    browser supports transparent content negotiation (TCN). However
  +    the browser does not support TCN. As of version 1.3.4, Apache
  +    supports TCN, and this causes problems with these versions of
  +    Lynx. As a workaround future versions of Apache will ignore
  +    this header when sent by the Lynx client.</p>
  +
  +    <h3><a id="ie40-vary" name="ie40-vary">MSIE 4.0 mishandles Vary
  +    response header</a></h3>
  +
  +    <p>MSIE 4.0 does not handle a Vary header properly. The Vary
  +    header is generated by mod_rewrite in Apache 1.3. The result is
  +    an error from MSIE saying it cannot download the requested
  +    file. There are more details in <a
  +    href="http://bugs.apache.org/index/full/4118">PR#4118</a>.</p>
  +
  +    <p>A workaround is to add the following to your server's
  +    configuration files:</p>
  +<pre>
  +    BrowserMatch "MSIE 4\.0" force-no-vary
  +</pre>
   
  -<!--#include virtual="footer.html" -->
  -</BODY>
  -</HTML>
  +    <p>(This workaround is only available with releases
  +    <strong>after</strong> 1.3.6 of the Apache Web server.)</p>
  +    <!--#include virtual="footer.html" -->
  +  </body>
  +</html>
   
  
  
  
  1.29      +922 -802  httpd-2.0/docs/manual/misc/perf-tuning.html
  
  Index: perf-tuning.html
  ===================================================================
  RCS file: /home/cvs/httpd-2.0/docs/manual/misc/perf-tuning.html,v
  retrieving revision 1.28
  retrieving revision 1.29
  diff -u -r1.28 -r1.29
  --- perf-tuning.html	2001/09/16 20:12:16	1.28
  +++ perf-tuning.html	2001/09/22 19:33:40	1.29
  @@ -1,182 +1,218 @@
  -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
  -<HTML>
  -<HEAD>
  - <TITLE>Apache Performance Notes</TITLE>
  -</HEAD>
  -<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  -<BODY
  - BGCOLOR="#FFFFFF"
  - TEXT="#000000"
  - LINK="#0000FF"
  - VLINK="#000080"
  - ALINK="#FF0000"
  ->
  -<!--#include virtual="header.html" -->
  -
  -<blockquote><strong>Warning:</strong>
  -This document has not been updated to take into account changes
  -made in the 2.0 version of the Apache HTTP Server.  Some of the
  -information may still be relevant, but please use it
  -with care.
  -</blockquote>
  -
  -<H1 align="center">Apache Performance Notes</H1>
  -
  -<P>Author: Dean Gaudet
  -
  -<ul>
  -<li><a href="#introduction">Introduction</a></li>
  -<li><a href="#hardware">Hardware and Operating System Issues</a></li>
  -<li><a href="#runtime">Run-Time Configuration Issues</a></li>
  -<li><a href="#compiletime">Compile-Time Configuration Issues</a></li>
  -<li>Appendixes
  -  <ul>
  -  <li><a href="#trace">Detailed Analysis of a Trace</a></li>
  -  <li><a href="#patches">Patches Available</a></li>
  -  <li><a href="#preforking">The Pre-Forking Model</a></li>
  -  </ul></li>
  -</ul>
  -
  -<hr>
  -
  -<table border="1">
  -<tr><td valign="top">
  -<strong>Related Modules</strong><br><br>
  -
  -<a href="../mod/mod_dir.html">mod_dir</a><br>
  -<a href="../mod/mpm_common.html">Multi-Processing module</a><br>
  -<a href="../mod/mod_status.html">mod_status</a><br>
  -
  -</td><td valign="top">
  -<strong>Related Directives</strong><br><br>
  -
  -<a href="../mod/core.html#allowoverride">AllowOverride</a><br>
  -<a href="../mod/mod_dir.html#directoryindex">DirectoryIndex</a><br>
  -<a href="../mod/core.html#hostnamelookups">HostNameLookups</a><br>
  -<a href="../mod/core.html#keepalivetimeout">KeepAliveTimeout</a><br>
  -<a href="../mod/prefork.html#maxspareservers">MaxSpareServers</a><br>
  -<a href="../mod/prefork.html#mixspareservers">MinSpareServers</a><br>
  -<a href="../mod/core.html#options">Options</a> (FollowSymLinks and
  -FollowIfOwnerMatch)<br>
  -<a href="../mod/mpm_common.html#startservers">StartServers</a><br>
  -
  -</td></tr></table>
  -
  -<H3><a name="introduction">Introduction</A></H3>
  -<P>Apache is a general webserver, which is designed to be correct first, and
  -fast second.  Even so, its performance is quite satisfactory.  Most
  -sites have less than 10Mbits of outgoing bandwidth, which Apache can
  -fill using only a low end Pentium-based webserver.  In practice sites
  -with more bandwidth require more than one machine to fill the bandwidth
  -due to other constraints (such as CGI or database transaction overhead).
  -For these reasons the development focus has been mostly on correctness
  -and configurability.
  -
  -<P>Unfortunately many folks overlook these facts and cite raw performance
  -numbers as if they are some indication of the quality of a web server
  -product.  There is a bare minimum performance that is acceptable, beyond
  -that extra speed only caters to a much smaller segment of the market.
  -But in order to avoid this hurdle to the acceptance of Apache in some
  -markets, effort was put into Apache 1.3 to bring performance up to a
  -point where the difference with other high-end webservers is minimal.
  -
  -<P>Finally there are the folks who just plain want to see how fast something
  -can go.  The author falls into this category.  The rest of this document
  -is dedicated to these folks who want to squeeze every last bit of
  -performance out of Apache's current model, and want to understand why
  -it does some things which slow it down.
  -
  -<P>Note that this is tailored towards Apache 1.3 on Unix.  Some of it applies
  -to Apache on NT.  Apache on NT has not been tuned for performance yet;
  -in fact it probably performs very poorly because NT performance requires
  -a different programming model.
  -
  -<hr>
  -
  -<H3><a name="hardware">Hardware and Operating System Issues</a></H3>
  -
  -<P>The single biggest hardware issue affecting webserver performance
  -is RAM.  A webserver should never ever have to swap, swapping increases
  -the latency of each request beyond a point that users consider "fast
  -enough".  This causes users to hit stop and reload, further increasing
  -the load.  You can, and should, control the <CODE>MaxClients</CODE>
  -setting so that your server does not spawn so many children it starts
  -swapping.
  -
  -<P>Beyond that the rest is mundane:  get a fast enough CPU, a fast enough
  -network card, and fast enough disks, where "fast enough" is something
  -that needs to be determined by experimentation.
  -
  -<P>Operating system choice is largely a matter of local concerns.  But
  -a general guideline is to always apply the latest vendor TCP/IP patches.
  -HTTP serving completely breaks many of the assumptions built into Unix
  -kernels up through 1994 and even 1995.  Good choices include
  -recent FreeBSD, and Linux.
  -
  -<hr>
  -
  -<H3><a name="runtime">Run-Time Configuration Issues</a></H3>
  -
  -<H4>HostnameLookups</H4>
  -<P>Prior to Apache 1.3, <CODE>HostnameLookups</CODE> defaulted to On.
  -This adds latency
  -to every request because it requires a DNS lookup to complete before
  -the request is finished.  In Apache 1.3 this setting defaults to Off.
  -However (1.3 or later), if you use any <CODE>Allow from domain</CODE> or
  -<CODE>Deny from domain</CODE> directives then you will pay for a
  -double reverse DNS lookup (a reverse, followed by a forward to make sure
  -that the reverse is not being spoofed).  So for the highest performance
  -avoid using these directives (it's fine to use IP addresses rather than
  -domain names).
  -
  -<P>Note that it's possible to scope the directives, such as within
  -a <CODE>&lt;Location /server-status&gt;</CODE> section.  In this
  -case the DNS lookups are only performed on requests matching the
  -criteria.  Here's an example which disables
  -lookups except for .html and .cgi files:
  +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  +    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
   
  -<BLOCKQUOTE><PRE>
  +<html xmlns="http://www.w3.org/1999/xhtml">
  +  <head>
  +    <meta name="generator" content="HTML Tidy, see www.w3.org" />
  +
  +    <title>Apache Performance Notes</title>
  +  </head>
  +  <!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  +
  +  <body bgcolor="#FFFFFF" text="#000000" link="#0000FF"
  +  vlink="#000080" alink="#FF0000">
  +    <!--#include virtual="header.html" -->
  +
  +    <blockquote>
  +      <strong>Warning:</strong> This document has not been updated
  +      to take into account changes made in the 2.0 version of the
  +      Apache HTTP Server. Some of the information may still be
  +      relevant, but please use it with care.
  +    </blockquote>
  +
  +    <h1 align="center">Apache Performance Notes</h1>
  +
  +    <p>Author: Dean Gaudet</p>
  +
  +    <ul>
  +      <li><a href="#introduction">Introduction</a></li>
  +
  +      <li><a href="#hardware">Hardware and Operating System
  +      Issues</a></li>
  +
  +      <li><a href="#runtime">Run-Time Configuration Issues</a></li>
  +
  +      <li><a href="#compiletime">Compile-Time Configuration
  +      Issues</a></li>
  +
  +      <li>
  +        Appendixes 
  +
  +        <ul>
  +          <li><a href="#trace">Detailed Analysis of a
  +          Trace</a></li>
  +
  +          <li><a href="#patches">Patches Available</a></li>
  +
  +          <li><a href="#preforking">The Pre-Forking Model</a></li>
  +        </ul>
  +      </li>
  +    </ul>
  +    <hr />
  +
  +    <table border="1">
  +      <tr>
  +        <td valign="top"><strong>Related Modules</strong><br />
  +         <br />
  +         <a href="../mod/mod_dir.html">mod_dir</a><br />
  +         <a href="../mod/mpm_common.html">Multi-Processing
  +        module</a><br />
  +         <a href="../mod/mod_status.html">mod_status</a><br />
  +         </td>
  +
  +        <td valign="top"><strong>Related Directives</strong><br />
  +         <br />
  +         <a
  +        href="../mod/core.html#allowoverride">AllowOverride</a><br />
  +         <a
  +        href="../mod/mod_dir.html#directoryindex">DirectoryIndex</a><br />
  +         <a
  +        href="../mod/core.html#hostnamelookups">HostNameLookups</a><br />
  +         <a
  +        href="../mod/core.html#keepalivetimeout">KeepAliveTimeout</a><br />
  +         <a
  +        href="../mod/prefork.html#maxspareservers">MaxSpareServers</a><br />
  +         <a
  +        href="../mod/prefork.html#minspareservers">MinSpareServers</a><br />
  +         <a href="../mod/core.html#options">Options</a>
  +        (FollowSymLinks and SymLinksIfOwnerMatch)<br />
  +         <a
  +        href="../mod/mpm_common.html#startservers">StartServers</a><br />
  +         </td>
  +      </tr>
  +    </table>
  +
  +    <h3><a id="introduction"
  +    name="introduction">Introduction</a></h3>
  +
  +    <p>Apache is a general webserver, which is designed to be
  +    correct first, and fast second. Even so, its performance is
  +    quite satisfactory. Most sites have less than 10 Mbit/s of
  +    outgoing bandwidth, which Apache can fill using only a low end
  +    Pentium-based webserver. In practice sites with more bandwidth
  +    require more than one machine to fill the bandwidth due to
  +    other constraints (such as CGI or database transaction
  +    overhead). For these reasons the development focus has been
  +    mostly on correctness and configurability.</p>
  +
  +    <p>Unfortunately many folks overlook these facts and cite raw
  +    performance numbers as if they are some indication of the
  +    quality of a web server product. There is a bare minimum
  +    performance that is acceptable; beyond that, extra speed only
  +    caters to a much smaller segment of the market. But in order to
  +    avoid this hurdle to the acceptance of Apache in some markets,
  +    effort was put into Apache 1.3 to bring performance up to a
  +    point where the difference with other high-end webservers is
  +    minimal.</p>
  +
  +    <p>Finally there are the folks who just plain want to see how
  +    fast something can go. The author falls into this category. The
  +    rest of this document is dedicated to these folks who want to
  +    squeeze every last bit of performance out of Apache's current
  +    model, and want to understand why it does some things which
  +    slow it down.</p>
  +
  +    <p>Note that this is tailored towards Apache 1.3 on Unix. Some
  +    of it applies to Apache on NT. Apache on NT has not been tuned
  +    for performance yet; in fact it probably performs very poorly
  +    because NT performance requires a different programming
  +    model.</p>
  +    <hr />
  +
  +    <h3><a id="hardware" name="hardware">Hardware and Operating
  +    System Issues</a></h3>
  +
  +    <p>The single biggest hardware issue affecting webserver
  +    performance is RAM. A webserver should never ever have to swap;
  +    swapping increases the latency of each request beyond a point
  +    that users consider "fast enough". This causes users to hit
  +    stop and reload, further increasing the load. You can, and
  +    should, control the <code>MaxClients</code> setting so that
  +    your server does not spawn so many children it starts
  +    swapping.</p>
  +
  +    <p>Beyond that the rest is mundane: get a fast enough CPU, a
  +    fast enough network card, and fast enough disks, where "fast
  +    enough" is something that needs to be determined by
  +    experimentation.</p>
  +
  +    <p>Operating system choice is largely a matter of local
  +    concerns. But a general guideline is to always apply the latest
  +    vendor TCP/IP patches. HTTP serving completely breaks many of
  +    the assumptions built into Unix kernels up through 1994 and
  +    even 1995. Good choices include recent FreeBSD and Linux.</p>
  +    <hr />
  +
  +    <h3><a id="runtime" name="runtime">Run-Time Configuration
  +    Issues</a></h3>
  +
  +    <h4>HostnameLookups</h4>
  +
  +    <p>Prior to Apache 1.3, <code>HostnameLookups</code> defaulted
  +    to On. This adds latency to every request because it requires a
  +    DNS lookup to complete before the request is finished. In
  +    Apache 1.3 this setting defaults to Off. However (1.3 or
  +    later), if you use any <code>Allow from domain</code> or
  +    <code>Deny from domain</code> directives then you will pay for
  +    a double reverse DNS lookup (a reverse, followed by a forward
  +    to make sure that the reverse is not being spoofed). So for the
  +    highest performance avoid using these directives (it's fine to
  +    use IP addresses rather than domain names).</p>
  +
  +    <p>Note that it's possible to scope the directives, such as
  +    within a <code>&lt;Location /server-status&gt;</code> section.
  +    In this case the DNS lookups are only performed on requests
  +    matching the criteria. Here's an example which disables lookups
  +    except for .html and .cgi files:</p>
  +
  +    <blockquote>
  +<pre>
   HostnameLookups off
   &lt;Files ~ "\.(html|cgi)$"&gt;
       HostnameLookups on
   &lt;/Files&gt;
  -</PRE></BLOCKQUOTE>
  -
  -But even still, if you just need DNS names
  -in some CGIs you could consider doing the
  -<CODE>gethostbyname</CODE> call in the specific CGIs that need it.
  -
  -<p>Similarly, if you need to have hostname information in your server
  -logs in order to generate reports of this information, you can
  -postprocess your log file with <a href="../programs/logresolve.html"
  ->logresolve</a>, so that these lookups can be done without making the
  -client wait. It is recommended that you do this postprocessing, and any
  -other statistical analysis of the log file, somewhere other than your
  -production web server machine, in order that this activity does not
  -adversely affect server performance.</p>
  -
  -<H4>FollowSymLinks and SymLinksIfOwnerMatch</H4>
  -<P>Wherever in your URL-space you do not have an
  -<CODE>Options FollowSymLinks</CODE>, or you do have an
  -<CODE>Options SymLinksIfOwnerMatch</CODE> Apache will have to
  -issue extra system calls to check up on symlinks.  One extra call per
  -filename component.  For example, if you had:
  +</pre>
  +    </blockquote>
  +    But even still, if you just need DNS names in some CGIs you
  +    could consider doing the <code>gethostbyname</code> call in the
  +    specific CGIs that need it. 
  +
  +    <p>Similarly, if you need to have hostname information in your
  +    server logs in order to generate reports of this information,
  +    you can postprocess your log file with <a
  +    href="../programs/logresolve.html">logresolve</a>, so that
  +    these lookups can be done without making the client wait. It is
  +    recommended that you do this postprocessing, and any other
  +    statistical analysis of the log file, somewhere other than your
  +    production web server machine, in order that this activity does
  +    not adversely affect server performance.</p>
  +
  +    <h4>FollowSymLinks and SymLinksIfOwnerMatch</h4>
  +
  +    <p>Wherever in your URL-space you do not have an <code>Options
  +    FollowSymLinks</code>, or you do have an <code>Options
  +    SymLinksIfOwnerMatch</code>, Apache will have to issue extra
  +    system calls to check up on symlinks, one extra call per
  +    filename component. For example, if you had:</p>
   
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   DocumentRoot /www/htdocs
   &lt;Directory /&gt;
       Options SymLinksIfOwnerMatch
   &lt;/Directory&gt;
  -</PRE></BLOCKQUOTE>
  -
  -and a request is made for the URI <CODE>/index.html</CODE>.
  -Then Apache will perform <CODE>lstat(2)</CODE> on <CODE>/www</CODE>,
  -<CODE>/www/htdocs</CODE>, and <CODE>/www/htdocs/index.html</CODE>.  The
  -results of these <CODE>lstats</CODE> are never cached,
  -so they will occur on every single request.  If you really desire the
  -symlinks security checking you can do something like this:
  +</pre>
  +    </blockquote>
  +    and a request is made for the URI <code>/index.html</code>,
  +    then Apache will perform <code>lstat(2)</code> on
  +    <code>/www</code>, <code>/www/htdocs</code>, and
  +    <code>/www/htdocs/index.html</code>. The results of these
  +    <code>lstats</code> are never cached, so they will occur on
  +    every single request. If you really desire the symlinks
  +    security checking you can do something like this: 
   
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   DocumentRoot /www/htdocs
   &lt;Directory /&gt;
       Options FollowSymLinks
  @@ -184,440 +220,486 @@
   &lt;Directory /www/htdocs&gt;
       Options -FollowSymLinks +SymLinksIfOwnerMatch
   &lt;/Directory&gt;
  -</PRE></BLOCKQUOTE>
  +</pre>
  +    </blockquote>
  +    This at least avoids the extra checks for the
  +    <code>DocumentRoot</code> path. Note that you'll need to add
  +    similar sections if you have any <code>Alias</code> or
  +    <code>RewriteRule</code> paths outside of your document root.
  +    For highest performance, and no symlink protection, set
  +    <code>FollowSymLinks</code> everywhere, and never set
  +    <code>SymLinksIfOwnerMatch</code>. 
  +
  +    <h4>AllowOverride</h4>
  +
  +    <p>Wherever in your URL-space you allow overrides (typically
  +    <code>.htaccess</code> files) Apache will attempt to open
  +    <code>.htaccess</code> for each filename component. For
  +    example,</p>
   
  -This at least avoids the extra checks for the <CODE>DocumentRoot</CODE>
  -path.  Note that you'll need to add similar sections if you have any
  -<CODE>Alias</CODE> or <CODE>RewriteRule</CODE> paths outside of your
  -document root.  For highest performance, and no symlink protection,
  -set <CODE>FollowSymLinks</CODE> everywhere, and never set
  -<CODE>SymLinksIfOwnerMatch</CODE>.
  -
  -<H4>AllowOverride</H4>
  -
  -<P>Wherever in your URL-space you allow overrides (typically
  -<CODE>.htaccess</CODE> files) Apache will attempt to open
  -<CODE>.htaccess</CODE> for each filename component.  For example,
  -
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   DocumentRoot /www/htdocs
   &lt;Directory /&gt;
       AllowOverride all
   &lt;/Directory&gt;
  -</PRE></BLOCKQUOTE>
  -
  -and a request is made for the URI <CODE>/index.html</CODE>.  Then
  -Apache will attempt to open <CODE>/.htaccess</CODE>,
  -<CODE>/www/.htaccess</CODE>, and <CODE>/www/htdocs/.htaccess</CODE>.
  -The solutions are similar to the previous case of <CODE>Options
  -FollowSymLinks</CODE>.  For highest performance use
  -<CODE>AllowOverride None</CODE> everywhere in your filesystem.
  -
  -<H4>Negotiation</H4>
  -
  -<P>If at all possible, avoid content-negotiation if you're really
  -interested in every last ounce of performance.  In practice the
  -benefits of negotiation outweigh the performance penalties.  There's
  -one case where you can speed up the server.  Instead of using
  -a wildcard such as:
  +</pre>
  +    </blockquote>
  +    and a request is made for the URI <code>/index.html</code>.
  +    Then Apache will attempt to open <code>/.htaccess</code>,
  +    <code>/www/.htaccess</code>, and
  +    <code>/www/htdocs/.htaccess</code>. The solutions are similar
  +    to the previous case of <code>Options FollowSymLinks</code>.
  +    For highest performance use <code>AllowOverride None</code>
  +    everywhere in your filesystem. 
  +
  +    <h4>Negotiation</h4>
  +
  +    <p>If at all possible, avoid content-negotiation if you're
  +    really interested in every last ounce of performance. In
  +    practice the benefits of negotiation outweigh the performance
  +    penalties. There's one case where you can speed up the server.
  +    Instead of using a wildcard such as:</p>
   
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   DirectoryIndex index
  -</PRE></BLOCKQUOTE>
  +</pre>
  +    </blockquote>
  +    Use a complete list of options: 
   
  -Use a complete list of options:
  -
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   DirectoryIndex index.cgi index.pl index.shtml index.html
  -</PRE></BLOCKQUOTE>
  -
  -where you list the most common choice first.
  -
  -<p>Also note that explicitly creating a <code>type-map</code> file
  -provides better performance than using <code>MultiViews</code>, as the
  -necessary information can be determined by reading this single file,
  -rather than having to scan the directory for files.</p>
  -
  -<H4>Process Creation</H4>
  -
  -<P>Prior to Apache 1.3 the <CODE>MinSpareServers</CODE>,
  -<CODE>MaxSpareServers</CODE>, and <CODE>StartServers</CODE> settings
  -all had drastic effects on benchmark results.  In particular, Apache
  -required a "ramp-up" period in order to reach a number of children
  -sufficient to serve the load being applied.  After the initial
  -spawning of <CODE>StartServers</CODE> children, only one child per
  -second would be created to satisfy the <CODE>MinSpareServers</CODE>
  -setting.  So a server being accessed by 100 simultaneous clients,
  -using the default <CODE>StartServers</CODE> of 5 would take on
  -the order 95 seconds to spawn enough children to handle the load.  This
  -works fine in practice on real-life servers, because they aren't restarted
  -frequently.  But does really poorly on benchmarks which might only run
  -for ten minutes.
  -
  -<P>The one-per-second rule was implemented in an effort to avoid
  -swamping the machine with the startup of new children.  If the machine
  -is busy spawning children it can't service requests.  But it has such
  -a drastic effect on the perceived performance of Apache that it had
  -to be replaced.  As of Apache 1.3,
  -the code will relax the one-per-second rule.  It
  -will spawn one, wait a second, then spawn two, wait a second, then spawn
  -four, and it will continue exponentially until it is spawning 32 children
  -per second.  It will stop whenever it satisfies the
  -<CODE>MinSpareServers</CODE> setting.
  -
  -<P>This appears to be responsive enough that it's
  -almost unnecessary to twiddle the <CODE>MinSpareServers</CODE>,
  -<CODE>MaxSpareServers</CODE> and <CODE>StartServers</CODE> knobs.  When
  -more than 4 children are spawned per second, a message will be emitted
  -to the <CODE>ErrorLog</CODE>.  If you see a lot of these errors then
  -consider tuning these settings.  Use the <CODE>mod_status</CODE> output
  -as a guide.
  -
  -<P>Related to process creation is process death induced by the
  -<CODE>MaxRequestsPerChild</CODE> setting.  By default this is 0, which
  -means that there is no limit to the number of requests handled
  -per child. If your configuration currently has this set to some
  -very low number, such as 30, you may want to bump this up significantly.
  -If you are running SunOS or an old version of Solaris, limit this
  -to 10000 or so because of memory leaks.
  -
  -<P>When keep-alives are in use, children will be kept busy
  -doing nothing waiting for more requests on the already open
  -connection.  The default <CODE>KeepAliveTimeout</CODE> of
  -15 seconds attempts to minimize this effect.  The tradeoff
  -here is between network bandwidth and server resources.
  -In no event should you raise this above about 60 seconds, as
  -<A HREF="http://www.research.digital.com/wrl/techreports/abstracts/95.4.html"
  ->most of the benefits are lost</A>.
  -
  -<hr>
  -
  -<H3><a name="compiletime">Compile-Time Configuration Issues</a></H3>
  -
  -<H4>mod_status and ExtendedStatus On</H4>
  -
  -<P>If you include <CODE>mod_status</CODE>
  -and you also set <CODE>ExtendedStatus On</CODE> when building and running
  -Apache, then on every request Apache will perform two calls to
  -<CODE>gettimeofday(2)</CODE> (or <CODE>times(2)</CODE> depending
  -on your operating system), and (pre-1.3) several extra calls to
  -<CODE>time(2)</CODE>.  This is all done so that the status report
  -contains timing indications.  For highest performance, set
  -<CODE>ExtendedStatus off</CODE> (which is the default).
  -
  -<H4>accept Serialization - multiple sockets</H4>
  -
  -<P>This discusses a shortcoming in the Unix socket API.
  -Suppose your
  -web server uses multiple <CODE>Listen</CODE> statements to listen on
  -either multiple ports or multiple addresses.  In order to test each
  -socket to see if a connection is ready Apache uses <CODE>select(2)</CODE>.
  -<CODE>select(2)</CODE> indicates that a socket has <EM>zero</EM> or
  -<EM>at least one</EM> connection waiting on it.  Apache's model includes
  -multiple children, and all the idle ones test for new connections at the
  -same time.  A naive implementation looks something like this
  -(these examples do not match the code, they're contrived for
  -pedagogical purposes):
  +</pre>
  +    </blockquote>
  +    where you list the most common choice first. 
  +
  +    <p>Also note that explicitly creating a <code>type-map</code>
  +    file provides better performance than using
  +    <code>MultiViews</code>, as the necessary information can be
  +    determined by reading this single file, rather than having to
  +    scan the directory for files.</p>
  +
  +    <h4>Process Creation</h4>
  +
  +    <p>Prior to Apache 1.3 the <code>MinSpareServers</code>,
  +    <code>MaxSpareServers</code>, and <code>StartServers</code>
  +    settings all had drastic effects on benchmark results. In
  +    particular, Apache required a "ramp-up" period in order to
  +    reach a number of children sufficient to serve the load being
  +    applied. After the initial spawning of
  +    <code>StartServers</code> children, only one child per second
  +    would be created to satisfy the <code>MinSpareServers</code>
  +    setting. So a server being accessed by 100 simultaneous
  +    clients, using the default <code>StartServers</code> of 5 would
  +    take on the order of 95 seconds to spawn enough children to handle
  +    the load. This works fine in practice on real-life servers,
  +    because they aren't restarted frequently, but it does really
  +    poorly on benchmarks which might only run for ten minutes.</p>
  +
  +    <p>The one-per-second rule was implemented in an effort to
  +    avoid swamping the machine with the startup of new children. If
  +    the machine is busy spawning children it can't service
  +    requests. But it has such a drastic effect on the perceived
  +    performance of Apache that it had to be replaced. As of Apache
  +    1.3, the code will relax the one-per-second rule. It will spawn
  +    one, wait a second, then spawn two, wait a second, then spawn
  +    four, and it will continue exponentially until it is spawning
  +    32 children per second. It will stop whenever it satisfies the
  +    <code>MinSpareServers</code> setting.</p>
  +
  +    <p>This appears to be responsive enough that it's almost
  +    unnecessary to twiddle the <code>MinSpareServers</code>,
  +    <code>MaxSpareServers</code> and <code>StartServers</code>
  +    knobs. When more than 4 children are spawned per second, a
  +    message will be emitted to the <code>ErrorLog</code>. If you
  +    see a lot of these errors then consider tuning these settings.
  +    Use the <code>mod_status</code> output as a guide.</p>
  +
  +    <p>Related to process creation is process death induced by the
  +    <code>MaxRequestsPerChild</code> setting. By default this is 0,
  +    which means that there is no limit to the number of requests
  +    handled per child. If your configuration currently has this set
  +    to some very low number, such as 30, you may want to bump this
  +    up significantly. If you are running SunOS or an old version of
  +    Solaris, limit this to 10000 or so because of memory leaks.</p>
  +
  +    <p>When keep-alives are in use, children will be kept busy
  +    doing nothing waiting for more requests on the already open
  +    connection. The default <code>KeepAliveTimeout</code> of 15
  +    seconds attempts to minimize this effect. The tradeoff here is
  +    between network bandwidth and server resources. In no event
  +    should you raise this above about 60 seconds, as <a
  +    href="http://www.research.digital.com/wrl/techreports/abstracts/95.4.html">
  +    most of the benefits are lost</a>.</p>
  +    <hr />
  +
  +    <h3><a id="compiletime" name="compiletime">Compile-Time
  +    Configuration Issues</a></h3>
  +
  +    <h4>mod_status and ExtendedStatus On</h4>
  +
  +    <p>If you include <code>mod_status</code> and you also set
  +    <code>ExtendedStatus On</code> when building and running
  +    Apache, then on every request Apache will perform two calls to
  +    <code>gettimeofday(2)</code> (or <code>times(2)</code>
  +    depending on your operating system), and (pre-1.3) several
  +    extra calls to <code>time(2)</code>. This is all done so that
  +    the status report contains timing indications. For highest
  +    performance, set <code>ExtendedStatus off</code> (which is the
  +    default).</p>
  +
  +    <h4>accept Serialization - multiple sockets</h4>
  +
  +    <p>This discusses a shortcoming in the Unix socket API. Suppose
  +    your web server uses multiple <code>Listen</code> statements to
  +    listen on either multiple ports or multiple addresses. In order
  +    to test each socket to see if a connection is ready Apache uses
  +    <code>select(2)</code>. <code>select(2)</code> indicates that a
  +    socket has <em>zero</em> or <em>at least one</em> connection
  +    waiting on it. Apache's model includes multiple children, and
  +    all the idle ones test for new connections at the same time. A
  +    naive implementation looks something like this (these examples
  +    do not match the code; they're contrived for pedagogical
  +    purposes):</p>
   
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
       for (;;) {
  -	for (;;) {
  -	    fd_set accept_fds;
  +    for (;;) {
  +        fd_set accept_fds;
   
  -	    FD_ZERO (&amp;accept_fds);
  -	    for (i = first_socket; i &lt;= last_socket; ++i) {
  -		FD_SET (i, &amp;accept_fds);
  -	    }
  -	    rc = select (last_socket+1, &amp;accept_fds, NULL, NULL, NULL);
  -	    if (rc &lt; 1) continue;
  -	    new_connection = -1;
  -	    for (i = first_socket; i &lt;= last_socket; ++i) {
  -		if (FD_ISSET (i, &amp;accept_fds)) {
  -		    new_connection = accept (i, NULL, NULL);
  -		    if (new_connection != -1) break;
  -		}
  -	    }
  -	    if (new_connection != -1) break;
  -	}
  -	process the new_connection;
  +        FD_ZERO (&amp;accept_fds);
  +        for (i = first_socket; i &lt;= last_socket; ++i) {
  +            FD_SET (i, &amp;accept_fds);
  +        }
  +        rc = select (last_socket+1, &amp;accept_fds, NULL, NULL, NULL);
  +        if (rc &lt; 1) continue;
  +        new_connection = -1;
  +        for (i = first_socket; i &lt;= last_socket; ++i) {
  +            if (FD_ISSET (i, &amp;accept_fds)) {
  +                new_connection = accept (i, NULL, NULL);
  +                if (new_connection != -1) break;
  +            }
  +        }
  +        if (new_connection != -1) break;
       }
  -</PRE></BLOCKQUOTE>
  -
  -But this naive implementation has a serious starvation problem.  Recall
  -that multiple children execute this loop at the same time, and so multiple
  -children will block at <CODE>select</CODE> when they are in between
  -requests.  All those blocked children will awaken and return from
  -<CODE>select</CODE> when a single request appears on any socket
  -(the number of children which awaken varies depending on the operating
  -system and timing issues).
  -They will all then fall down into the loop and try to <CODE>accept</CODE>
  -the connection.  But only one will succeed (assuming there's still only
  -one connection ready), the rest will be <EM>blocked</EM> in
  -<CODE>accept</CODE>.
  -This effectively locks those children into serving requests from that
  -one socket and no other sockets, and they'll be stuck there until enough
  -new requests appear on that socket to wake them all up.
  -This starvation problem was first documented in
  -<A HREF="http://bugs.apache.org/index/full/467">PR#467</A>.  There
  -are at least two solutions.
  -
  -<P>One solution is to make the sockets non-blocking.  In this case the
  -<CODE>accept</CODE> won't block the children, and they will be allowed
  -to continue immediately.  But this wastes CPU time.  Suppose you have
  -ten idle children in <CODE>select</CODE>, and one connection arrives.
  -Then nine of those children will wake up, try to <CODE>accept</CODE> the
  -connection, fail, and loop back into <CODE>select</CODE>, accomplishing
  -nothing.  Meanwhile none of those children are servicing requests that
  -occurred on other sockets until they get back up to the <CODE>select</CODE>
  -again.  Overall this solution does not seem very fruitful unless you
  -have as many idle CPUs (in a multiprocessor box) as you have idle children,
  -not a very likely situation.
  -
  -<P>Another solution, the one used by Apache, is to serialize entry into
  -the inner loop.  The loop looks like this (differences highlighted):
  +    process the new_connection;
  +    }
  +</pre>
  +    </blockquote>
  +    But this naive implementation has a serious starvation problem.
  +    Recall that multiple children execute this loop at the same
  +    time, and so multiple children will block at
  +    <code>select</code> when they are in between requests. All
  +    those blocked children will awaken and return from
  +    <code>select</code> when a single request appears on any socket
  +    (the number of children which awaken varies depending on the
  +    operating system and timing issues). They will all then fall
  +    down into the loop and try to <code>accept</code> the
  +    connection. But only one will succeed (assuming there's still
  +    only one connection ready); the rest will be <em>blocked</em>
  +    in <code>accept</code>. This effectively locks those children
  +    into serving requests from that one socket and no other
  +    sockets, and they'll be stuck there until enough new requests
  +    appear on that socket to wake them all up. This starvation
  +    problem was first documented in <a
  +    href="http://bugs.apache.org/index/full/467">PR#467</a>. There
  +    are at least two solutions. 
  +
  +    <p>One solution is to make the sockets non-blocking. In this
  +    case the <code>accept</code> won't block the children, and they
  +    will be allowed to continue immediately. But this wastes CPU
  +    time. Suppose you have ten idle children in
  +    <code>select</code>, and one connection arrives. Then nine of
  +    those children will wake up, try to <code>accept</code> the
  +    connection, fail, and loop back into <code>select</code>,
  +    accomplishing nothing. Meanwhile none of those children are
  +    servicing requests that occurred on other sockets until they
  +    get back up to the <code>select</code> again. Overall this
  +    solution does not seem very fruitful unless you have as many
  +    idle CPUs (in a multiprocessor box) as you have idle children,
  +    not a very likely situation.</p>
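The non-blocking alternative described above can be sketched in C as follows. This is a minimal illustration, not Apache's code: the helper names `make_nonblocking_listener` and `try_accept` are hypothetical, and the listener binds to an ephemeral loopback port purely so the sketch is self-contained. The point it shows is that `accept` on a non-blocking socket returns immediately with `EAGAIN` or `EWOULDBLOCK` when another child has already taken the connection, so the caller goes back to `select` instead of blocking.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a listening TCP socket on an ephemeral
 * loopback port and switch it to non-blocking mode.
 * Returns the descriptor, or -1 on error. */
int make_nonblocking_listener(void)
{
    struct sockaddr_in addr;
    int flags;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0) return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                 /* let the kernel pick a port */

    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0
        || listen(fd, 16) < 0
        || (flags = fcntl(fd, F_GETFL, 0)) < 0
        || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: try to accept one connection.
 * Returns the new descriptor, -1 if no connection was pending (the
 * caller should loop back to select), or -2 on any other error. */
int try_accept(int listen_fd)
{
    int conn = accept(listen_fd, NULL, NULL);

    if (conn >= 0) return conn;
    if (errno == EAGAIN || errno == EWOULDBLOCK) return -1;
    return -2;
}
```

A child that gets -1 back has burned a wakeup and a system call for nothing, which is the wasted CPU time the paragraph above describes.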
  +
  +    <p>Another solution, the one used by Apache, is to serialize
  +    entry into the inner loop. The loop looks like this
  +    (differences highlighted):</p>
   
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
       for (;;) {
  -	<STRONG>accept_mutex_on ();</STRONG>
  -	for (;;) {
  -	    fd_set accept_fds;
  -
  -	    FD_ZERO (&amp;accept_fds);
  -	    for (i = first_socket; i &lt;= last_socket; ++i) {
  -		FD_SET (i, &amp;accept_fds);
  -	    }
  -	    rc = select (last_socket+1, &amp;accept_fds, NULL, NULL, NULL);
  -	    if (rc &lt; 1) continue;
  -	    new_connection = -1;
  -	    for (i = first_socket; i &lt;= last_socket; ++i) {
  -		if (FD_ISSET (i, &amp;accept_fds)) {
  -		    new_connection = accept (i, NULL, NULL);
  -		    if (new_connection != -1) break;
  -		}
  -	    }
  -	    if (new_connection != -1) break;
  -	}
  -	<STRONG>accept_mutex_off ();</STRONG>
  -	process the new_connection;
  -    }
  -</PRE></BLOCKQUOTE>
  +    <strong>accept_mutex_on ();</strong>
  +    for (;;) {
  +        fd_set accept_fds;
   
  -<A NAME="serialize">The functions</A>
  -<CODE>accept_mutex_on</CODE> and <CODE>accept_mutex_off</CODE>
  -implement a mutual exclusion semaphore.  Only one child can have the
  -mutex at any time.  There are several choices for implementing these
  -mutexes.  The choice is defined in <CODE>src/conf.h</CODE> (pre-1.3) or
  -<CODE>src/include/ap_config.h</CODE> (1.3 or later).  Some architectures
  -do not have any locking choice made, on these architectures it is unsafe
  -to use multiple <CODE>Listen</CODE> directives.
  -
  -<DL>
  -<DT><CODE>USE_FLOCK_SERIALIZED_ACCEPT</CODE>
  -<DD>This method uses the <CODE>flock(2)</CODE> system call to lock a
  -lock file (located by the <CODE>LockFile</CODE> directive).
  -
  -<DT><CODE>USE_FCNTL_SERIALIZED_ACCEPT</CODE>
  -<DD>This method uses the <CODE>fcntl(2)</CODE> system call to lock a
  -lock file (located by the <CODE>LockFile</CODE> directive).
  -
  -<DT><CODE>USE_SYSVSEM_SERIALIZED_ACCEPT</CODE>
  -<DD>(1.3 or later) This method uses SysV-style semaphores to implement the
  -mutex.  Unfortunately SysV-style semaphores have some bad side-effects.
  -One is that it's possible Apache will die without cleaning up the semaphore
  -(see the <CODE>ipcs(8)</CODE> man page).  The other is that the semaphore
  -API allows for a denial of service attack by any CGIs running under the
  -same uid as the webserver (<EM>i.e.</EM>, all CGIs, unless you use something
  -like suexec or cgiwrapper).  For these reasons this method is not used
  -on any architecture except IRIX (where the previous two are prohibitively
  -expensive on most IRIX boxes).
  -
  -<DT><CODE>USE_USLOCK_SERIALIZED_ACCEPT</CODE>
  -<DD>(1.3 or later) This method is only available on IRIX, and uses
  -<CODE>usconfig(2)</CODE> to create a mutex.  While this method avoids
  -the hassles of SysV-style semaphores, it is not the default for IRIX.
  -This is because on single processor IRIX boxes (5.3 or 6.2) the
  -uslock code is two orders of magnitude slower than the SysV-semaphore
  -code.  On multi-processor IRIX boxes the uslock code is an order of magnitude
  -faster than the SysV-semaphore code.  Kind of a messed up situation.
  -So if you're using a multiprocessor IRIX box then you should rebuild your
  -webserver with <CODE>-DUSE_USLOCK_SERIALIZED_ACCEPT</CODE> on the
  -<CODE>EXTRA_CFLAGS</CODE>.
  -
  -<DT><CODE>USE_PTHREAD_SERIALIZED_ACCEPT</CODE>
  -<DD>(1.3 or later) This method uses POSIX mutexes and should work on
  -any architecture implementing the full POSIX threads specification,
  -however appears to only work on Solaris (2.5 or later), and even then
  -only in certain configurations.  If you experiment with this you should
  -watch out for your server hanging and not responding.  Static content
  -only servers may work just fine.
  -</DL>
  -
  -<P>If your system has another method of serialization which isn't in the
  -above list then it may be worthwhile adding code for it (and submitting
  -a patch back to Apache).
  -
  -<P>Another solution that has been considered but never implemented is
  -to partially serialize the loop -- that is, let in a certain number
  -of processes.  This would only be of interest on multiprocessor boxes
  -where it's possible multiple children could run simultaneously, and the
  -serialization actually doesn't take advantage of the full bandwidth.
  -This is a possible area of future investigation, but priority remains
  -low because highly parallel web servers are not the norm.
  -
  -<P>Ideally you should run servers without multiple <CODE>Listen</CODE>
  -statements if you want the highest performance.  But read on.
  -
  -<H4>accept Serialization - single socket</H4>
  -
  -<P>The above is fine and dandy for multiple socket servers, but what
  -about single socket servers?  In theory they shouldn't experience
  -any of these same problems because all children can just block in
  -<CODE>accept(2)</CODE> until a connection arrives, and no starvation
  -results.  In practice this hides almost the same "spinning" behaviour
  -discussed above in the non-blocking solution.  The way that most TCP
  -stacks are implemented, the kernel actually wakes up all processes blocked
  -in <CODE>accept</CODE> when a single connection arrives.  One of those
  -processes gets the connection and returns to user-space, the rest spin in
  -the kernel and go back to sleep when they discover there's no connection
  -for them.  This spinning is hidden from the user-land code, but it's
  -there nonetheless.  This can result in the same load-spiking wasteful
  -behaviour that a non-blocking solution to the multiple sockets case can.
  -
  -<P>For this reason we have found that many architectures behave more
  -"nicely" if we serialize even the single socket case.  So this is
  -actually the default in almost all cases.  Crude experiments under
  -Linux (2.0.30 on a dual Pentium pro 166 w/128Mb RAM) have shown that
  -the serialization of the single socket case causes less than a 3%
  -decrease in requests per second over unserialized single-socket.
  -But unserialized single-socket showed an extra 100ms latency on
  -each request.  This latency is probably a wash on long haul lines,
  -and only an issue on LANs.  If you want to override the single socket
  -serialization you can define <CODE>SINGLE_LISTEN_UNSERIALIZED_ACCEPT</CODE>
  -and then single-socket servers will not serialize at all.
  -
  -<H4>Lingering Close</H4>
  -
  -<P>As discussed in
  -<A
  - HREF="http://www.ics.uci.edu/pub/ietf/http/draft-ietf-http-connection-00.txt"
  ->draft-ietf-http-connection-00.txt</A> section 8,
  -in order for an HTTP server to <STRONG>reliably</STRONG> implement the protocol
  -it needs to shutdown each direction of the communication independently
  -(recall that a TCP connection is bi-directional, each half is independent
  -of the other).  This fact is often overlooked by other servers, but
  -is correctly implemented in Apache as of 1.2.
  -
  -<P>When this feature was added to Apache it caused a flurry of
  -problems on various versions of Unix because of a shortsightedness.
  -The TCP specification does not state that the FIN_WAIT_2 state has a
  -timeout, but it doesn't prohibit it.  On systems without the timeout,
  -Apache 1.2 induces many sockets stuck forever in the FIN_WAIT_2 state.
  -In many cases this can be avoided by simply upgrading to the latest
  -TCP/IP patches supplied by the vendor.  In cases where the vendor has
  -never released patches (<EM>i.e.</EM>,  SunOS4 -- although folks with a source
  -license can patch it themselves) we have decided to disable this feature.
  -
  -<P>There are two ways of accomplishing this.  One is the
  -socket option <CODE>SO_LINGER</CODE>.  But as fate would have it,
  -this has never been implemented properly in most TCP/IP stacks.  Even
  -on those stacks with a proper implementation (<EM>i.e.</EM>, Linux 2.0.31) this
  -method proves to be more expensive (cputime) than the next solution.
  -
  -<P>For the most part, Apache implements this in a function called
  -<CODE>lingering_close</CODE> (in <CODE>http_main.c</CODE>).  The
  -function looks roughly like this:
  +        FD_ZERO (&amp;accept_fds);
  +        for (i = first_socket; i &lt;= last_socket; ++i) {
  +            FD_SET (i, &amp;accept_fds);
  +        }
  +        rc = select (last_socket+1, &amp;accept_fds, NULL, NULL, NULL);
  +        if (rc &lt; 1) continue;
  +        new_connection = -1;
  +        for (i = first_socket; i &lt;= last_socket; ++i) {
  +            if (FD_ISSET (i, &amp;accept_fds)) {
  +                new_connection = accept (i, NULL, NULL);
  +                if (new_connection != -1) break;
  +            }
  +        }
  +        if (new_connection != -1) break;
  +    }
  +    <strong>accept_mutex_off ();</strong>
  +    process the new_connection;
  +    }
  +</pre>
  +    </blockquote>
  +    <a id="serialize" name="serialize">The functions</a>
  +    <code>accept_mutex_on</code> and <code>accept_mutex_off</code>
  +    implement a mutual exclusion semaphore. Only one child can have
  +    the mutex at any time. There are several choices for
  +    implementing these mutexes. The choice is defined in
  +    <code>src/conf.h</code> (pre-1.3) or
  +    <code>src/include/ap_config.h</code> (1.3 or later). Some
  +    architectures do not have any locking choice made; on these
  +    architectures it is unsafe to use multiple <code>Listen</code>
  +    directives. 
  +
  +    <dl>
  +      <dt><code>USE_FLOCK_SERIALIZED_ACCEPT</code></dt>
  +
  +      <dd>This method uses the <code>flock(2)</code> system call to
  +      lock a lock file (located by the <code>LockFile</code>
  +      directive).</dd>
  +
  +      <dt><code>USE_FCNTL_SERIALIZED_ACCEPT</code></dt>
  +
  +      <dd>This method uses the <code>fcntl(2)</code> system call to
  +      lock a lock file (located by the <code>LockFile</code>
  +      directive).</dd>
  +
  +      <dt><code>USE_SYSVSEM_SERIALIZED_ACCEPT</code></dt>
  +
  +      <dd>(1.3 or later) This method uses SysV-style semaphores to
  +      implement the mutex. Unfortunately SysV-style semaphores have
  +      some bad side-effects. One is that it's possible Apache will
  +      die without cleaning up the semaphore (see the
  +      <code>ipcs(8)</code> man page). The other is that the
  +      semaphore API allows for a denial of service attack by any
  +      CGIs running under the same uid as the webserver
  +      (<em>i.e.</em>, all CGIs, unless you use something like
  +      suexec or cgiwrapper). For these reasons this method is not
  +      used on any architecture except IRIX (where the previous two
  +      are prohibitively expensive on most IRIX boxes).</dd>
  +
  +      <dt><code>USE_USLOCK_SERIALIZED_ACCEPT</code></dt>
  +
  +      <dd>(1.3 or later) This method is only available on IRIX, and
  +      uses <code>usconfig(2)</code> to create a mutex. While this
  +      method avoids the hassles of SysV-style semaphores, it is not
  +      the default for IRIX. This is because on single processor
  +      IRIX boxes (5.3 or 6.2) the uslock code is two orders of
  +      magnitude slower than the SysV-semaphore code. On
  +      multi-processor IRIX boxes the uslock code is an order of
  +      magnitude faster than the SysV-semaphore code. Kind of a
  +      messed up situation. So if you're using a multiprocessor IRIX
  +      box then you should rebuild your webserver with
  +      <code>-DUSE_USLOCK_SERIALIZED_ACCEPT</code> on the
  +      <code>EXTRA_CFLAGS</code>.</dd>
  +
  +      <dt><code>USE_PTHREAD_SERIALIZED_ACCEPT</code></dt>
  +
  +      <dd>(1.3 or later) This method uses POSIX mutexes and should
  +      work on any architecture implementing the full POSIX threads
  +      specification; however, it appears to work only on Solaris
  +      (2.5 or later), and even then only in certain configurations.
  +      If you experiment with this you should watch out for your
  +      server hanging and not responding. Static-content-only
  +      servers may work just fine.</dd>
  +    </dl>
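As an illustration of the `fcntl(2)` method in the list above, here is a minimal sketch of an accept mutex built on a whole-file write lock. It is an assumption-laden sketch rather than Apache's implementation: `accept_mutex_init` and the internal `set_lock` helper are hypothetical names, and the lock file path is whatever the caller passes in (in Apache it would come from the `LockFile` directive).

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

static int lock_fd = -1;    /* descriptor of the shared lock file */

/* Hypothetical helper: open (or create) the lock file. The parent
 * would call this once before forking; fcntl locks are owned per
 * process, so each child can lock through the inherited descriptor. */
int accept_mutex_init(const char *path)
{
    lock_fd = open(path, O_CREAT | O_WRONLY, 0600);
    return lock_fd < 0 ? -1 : 0;
}

/* Apply a lock of the given type (F_WRLCK or F_UNLCK) to the whole
 * file, blocking until it is granted and retrying if a signal
 * interrupts the wait. */
static int set_lock(short type)
{
    struct flock fl;
    int rc;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;           /* zero length means "to end of file" */

    do {
        rc = fcntl(lock_fd, F_SETLKW, &fl);
    } while (rc < 0 && errno == EINTR);
    return rc;
}

int accept_mutex_on(void)  { return set_lock(F_WRLCK); }
int accept_mutex_off(void) { return set_lock(F_UNLCK); }
```

Each child would call `accept_mutex_on()` before entering the `select`/`accept` loop and `accept_mutex_off()` as soon as it holds a connection, matching the highlighted lines in the loop above.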
  +
  +    <p>If your system has another method of serialization which
  +    isn't in the above list then it may be worthwhile adding code
  +    for it (and submitting a patch back to Apache).</p>
  +
  +    <p>Another solution that has been considered but never
  +    implemented is to partially serialize the loop -- that is, let
  +    in a certain number of processes. This would only be of
  +    interest on multiprocessor boxes where it's possible multiple
  +    children could run simultaneously, and the serialization
  +    actually doesn't take advantage of the full bandwidth. This is
  +    a possible area of future investigation, but priority remains
  +    low because highly parallel web servers are not the norm.</p>
  +
  +    <p>Ideally you should run servers without multiple
  +    <code>Listen</code> statements if you want the highest
  +    performance. But read on.</p>
  +
  +    <h4>accept Serialization - single socket</h4>
  +
  +    <p>The above is fine and dandy for multiple socket servers, but
  +    what about single socket servers? In theory they shouldn't
  +    experience any of these same problems because all children can
  +    just block in <code>accept(2)</code> until a connection
  +    arrives, and no starvation results. In practice this hides
  +    almost the same "spinning" behaviour discussed above in the
  +    non-blocking solution. The way that most TCP stacks are
  +    implemented, the kernel actually wakes up all processes blocked
  +    in <code>accept</code> when a single connection arrives. One of
  +    those processes gets the connection and returns to user-space,
  +    the rest spin in the kernel and go back to sleep when they
  +    discover there's no connection for them. This spinning is
  +    hidden from the user-land code, but it's there nonetheless.
  +    This can result in the same load-spiking wasteful behaviour
  +    that a non-blocking solution to the multiple sockets case
  +    can.</p>
  +
  +    <p>For this reason we have found that many architectures behave
  +    more "nicely" if we serialize even the single socket case. So
  +    this is actually the default in almost all cases. Crude
  +    experiments under Linux (2.0.30 on a dual Pentium Pro 166
  +    with 128MB RAM) have shown that the serialization of the single
  +    socket case causes less than a 3% decrease in requests per
  +    second over unserialized single-socket. But unserialized
  +    single-socket showed an extra 100ms latency on each request.
  +    This latency is probably a wash on long haul lines, and only an
  +    issue on LANs. If you want to override the single socket
  +    serialization you can define
  +    <code>SINGLE_LISTEN_UNSERIALIZED_ACCEPT</code> and then
  +    single-socket servers will not serialize at all.</p>
  +
  +    <h4>Lingering Close</h4>
  +
  +    <p>As discussed in <a
  +    href="http://www.ics.uci.edu/pub/ietf/http/draft-ietf-http-connection-00.txt">
  +    draft-ietf-http-connection-00.txt</a> section 8, in order for
  +    an HTTP server to <strong>reliably</strong> implement the
  +    protocol it needs to shut down each direction of the
  +    communication independently (recall that a TCP connection is
  +    bi-directional; each half is independent of the other). This
  +    fact is often overlooked by other servers, but is correctly
  +    implemented in Apache as of 1.2.</p>
  +
  +    <p>When this feature was added to Apache it caused a flurry of
  +    problems on various versions of Unix because of an oversight.
  +    The TCP specification does not state that the FIN_WAIT_2 state
  +    has a timeout, but it doesn't prohibit one. On systems without
  +    the timeout, Apache 1.2 leaves many sockets stuck forever in
  +    the FIN_WAIT_2 state. In many cases this can be avoided by
  +    simply upgrading to the latest TCP/IP patches supplied by the
  +    vendor. In cases where the vendor has never released patches
  +    (<em>e.g.</em>, SunOS4 -- although folks with a source license
  +    can patch it themselves) we have decided to disable this
  +    feature.</p>
  +
  +    <p>There are two ways of implementing a lingering close. One is
  +    the socket option <code>SO_LINGER</code>. But as fate would
  +    have it, this has never been implemented properly in most
  +    TCP/IP stacks. Even on those stacks with a proper
  +    implementation (<em>e.g.</em>, Linux 2.0.31) this method proves
  +    to be more expensive (in CPU time) than the next solution.</p>
  +
  +    <p>For the most part, Apache implements this in a function
  +    called <code>lingering_close</code> (in
  +    <code>http_main.c</code>). The function looks roughly like
  +    this:</p>
   
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
       void lingering_close (int s)
       {
  -	char junk_buffer[2048];
  -
  -	/* shutdown the sending side */
  -	shutdown (s, 1);
  +    char junk_buffer[2048];
   
  -	signal (SIGALRM, lingering_death);
  -	alarm (30);
  +    /* shutdown the sending side */
  +    shutdown (s, 1);
   
  -	for (;;) {
  -	    select (s for reading, 2 second timeout);
  -	    if (error) break;
  -	    if (s is ready for reading) {
  -		if (read (s, junk_buffer, sizeof (junk_buffer)) &lt;= 0) {
  -		    break;
  -		}
  -		/* just toss away whatever is here */
  -	    }
  -	}
  +    signal (SIGALRM, lingering_death);
  +    alarm (30);
   
  -	close (s);
  +    for (;;) {
  +        select (s for reading, 2 second timeout);
  +        if (error) break;
  +        if (s is ready for reading) {
  +            if (read (s, junk_buffer, sizeof (junk_buffer)) &lt;= 0) {
  +                break;
  +            }
  +            /* just toss away whatever is here */
  +        }
       }
  -</PRE></BLOCKQUOTE>
  -
  -This naturally adds some expense at the end of a connection, but it
  -is required for a reliable implementation.  As HTTP/1.1 becomes more
  -prevalent, and all connections are persistent, this expense will be
  -amortized over more requests.  If you want to play with fire and
  -disable this feature you can define <CODE>NO_LINGCLOSE</CODE>, but
  -this is not recommended at all.  In particular, as HTTP/1.1 pipelined
  -persistent connections come into use <CODE>lingering_close</CODE>
  -is an absolute necessity (and
  -<A HREF="http://www.w3.org/Protocols/HTTP/Performance/Pipeline.html">
  -pipelined connections are faster</A>, so you
  -want to support them).
  -
  -<H4>Scoreboard File</H4>
  -
  -<P>Apache's parent and children communicate with each other through
  -something called the scoreboard.  Ideally this should be implemented
  -in shared memory.  For those operating systems that we either have
  -access to, or have been given detailed ports for, it typically is
  -implemented using shared memory.  The rest default to using an
  -on-disk file.  The on-disk file is not only slow, but it is unreliable
  -(and less featured).  Peruse the <CODE>src/main/conf.h</CODE> file
  -for your architecture and look for either <CODE>USE_MMAP_SCOREBOARD</CODE> or
  -<CODE>USE_SHMGET_SCOREBOARD</CODE>.  Defining one of those two (as
  -well as their companions <CODE>HAVE_MMAP</CODE> and <CODE>HAVE_SHMGET</CODE>
  -respectively) enables the supplied shared memory code.  If your system has
  -another type of shared memory, edit the file <CODE>src/main/http_main.c</CODE>
  -and add the hooks necessary to use it in Apache.  (Send us back a patch
  -too please.)
  -
  -<P>Historical note:  The Linux port of Apache didn't start to use
  -shared memory until version 1.2 of Apache.  This oversight resulted
  -in really poor and unreliable behaviour of earlier versions of Apache
  -on Linux.
  -
  -<H4><CODE>DYNAMIC_MODULE_LIMIT</CODE></H4>
  -
  -<P>If you have no intention of using dynamically loaded modules
  -(you probably don't if you're reading this and tuning your
  -server for every last ounce of performance) then you should add
  -<CODE>-DDYNAMIC_MODULE_LIMIT=0</CODE> when building your server.
  -This will save RAM that's allocated only for supporting dynamically
  -loaded modules.
  -
  -<hr>
   
  -<H3><a name="trace">Appendix: Detailed Analysis of a Trace</a></H3>
  -
  -Here is a system call trace of Apache 1.3 running on Linux.  The run-time
  -configuration file is essentially the default plus:
  +    close (s);
  +    }
  +</pre>
  +    </blockquote>
  +    This naturally adds some expense at the end of a connection,
  +    but it is required for a reliable implementation. As HTTP/1.1
  +    becomes more prevalent, and all connections are persistent,
  +    this expense will be amortized over more requests. If you want
  +    to play with fire and disable this feature you can define
  +    <code>NO_LINGCLOSE</code>, but this is not recommended at all.
  +    In particular, as HTTP/1.1 pipelined persistent connections
  +    come into use <code>lingering_close</code> is an absolute
  +    necessity (and <a
  +    href="http://www.w3.org/Protocols/HTTP/Performance/Pipeline.html">
  +    pipelined connections are faster</a>, so you want to support
  +    them). 
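The rough `lingering_close` outline above can be turned into compilable C along the following lines. This is a sketch under stated assumptions, not the code in `http_main.c`: the name `lingering_close_sketch` is hypothetical, the `SIGALRM`/`alarm(30)` guard is replaced by a wall-clock deadline so the example needs no signal handler, and the function returns the number of bytes it discarded, which the original does not do.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

#define MAX_LINGER_SECS 30   /* overall limit, as in the outline */
#define POLL_SECS        2   /* per-select timeout, as in the outline */

/* Half-close the socket, drain whatever the peer still sends, then
 * close it. Returns the number of bytes thrown away (an addition made
 * here only so the behaviour can be observed). */
long lingering_close_sketch(int s)
{
    char junk_buffer[2048];
    long discarded = 0;
    time_t deadline = time(NULL) + MAX_LINGER_SECS;

    /* shut down the sending side; the peer sees end-of-file */
    shutdown(s, SHUT_WR);

    while (time(NULL) < deadline) {
        fd_set fds;
        struct timeval tv;
        ssize_t n;
        int rc;

        FD_ZERO(&fds);
        FD_SET(s, &fds);
        tv.tv_sec = POLL_SECS;
        tv.tv_usec = 0;

        rc = select(s + 1, &fds, NULL, NULL, &tv);
        if (rc < 0) break;           /* error */
        if (rc == 0) continue;       /* timeout: poll again */

        n = read(s, junk_buffer, sizeof(junk_buffer));
        if (n <= 0) break;           /* peer closed, or read error */
        discarded += n;              /* just toss away whatever is here */
    }

    close(s);
    return discarded;
}
```

The loop ends as soon as the peer closes its side, so on a well-behaved client the extra cost is one `shutdown`, one `select` and one `read`.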
  +
  +    <h4>Scoreboard File</h4>
  +
  +    <p>Apache's parent and children communicate with each other
  +    through something called the scoreboard. Ideally this should be
  +    implemented in shared memory. For those operating systems that
  +    we either have access to, or have been given detailed ports
  +    for, it typically is implemented using shared memory. The rest
  +    default to using an on-disk file. The on-disk file is not only
  +    slow, but it is unreliable (and less featured). Peruse the
  +    <code>src/main/conf.h</code> file for your architecture and
  +    look for either <code>USE_MMAP_SCOREBOARD</code> or
  +    <code>USE_SHMGET_SCOREBOARD</code>. Defining one of those two
  +    (as well as their companions <code>HAVE_MMAP</code> and
  +    <code>HAVE_SHMGET</code> respectively) enables the supplied
  +    shared memory code. If your system has another type of shared
  +    memory, edit the file <code>src/main/http_main.c</code> and add
  +    the hooks necessary to use it in Apache. (Send us back a patch
  +    too please.)</p>
  +
  +    <p>Historical note: The Linux port of Apache didn't start to
  +    use shared memory until version 1.2 of Apache. This oversight
  +    resulted in really poor and unreliable behaviour of earlier
  +    versions of Apache on Linux.</p>
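To make the shared-memory scoreboard idea concrete, here is a minimal sketch using an anonymous shared `mmap` region that a parent and a forked child both see. Everything in it is illustrative: the `sketch_slot` layout, `scoreboard_create` and `scoreboard_demo` are hypothetical names, and Apache's real scoreboard structure and its `USE_MMAP_SCOREBOARD` code differ. It also assumes a platform that provides `MAP_ANONYMOUS`.

```c
#define _DEFAULT_SOURCE      /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SKETCH_MAX_CHILDREN 4

/* Hypothetical per-child record; the real scoreboard holds far more. */
typedef struct {
    volatile int status;              /* 0 = idle, 1 = busy */
    volatile unsigned long requests;  /* requests served so far */
} sketch_slot;

/* Map a region that stays shared across fork(). Returns NULL on error. */
sketch_slot *scoreboard_create(void)
{
    void *p = mmap(NULL, sizeof(sketch_slot) * SKETCH_MAX_CHILDREN,
                   PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : (sketch_slot *) p;
}

/* Fork one child that updates its slot, wait for it, and return the
 * request count the parent reads back from shared memory. */
unsigned long scoreboard_demo(sketch_slot *sb, int slot, unsigned long n)
{
    pid_t pid = fork();

    if (pid < 0) return 0;
    if (pid == 0) {                   /* child: record its state */
        sb[slot].status = 1;
        sb[slot].requests = n;
        _exit(0);
    }
    waitpid(pid, NULL, 0);            /* parent: read the child's slot */
    return sb[slot].requests;
}
```

Because the mapping is `MAP_SHARED`, the child's writes are visible to the parent without any file I/O, which is why this approach is faster and more reliable than the on-disk scoreboard.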
  +
  +    <h4><code>DYNAMIC_MODULE_LIMIT</code></h4>
  +
  +    <p>If you have no intention of using dynamically loaded modules
  +    (you probably don't if you're reading this and tuning your
  +    server for every last ounce of performance) then you should add
  +    <code>-DDYNAMIC_MODULE_LIMIT=0</code> when building your
  +    server. This will save RAM that's allocated only for supporting
  +    dynamically loaded modules.</p>
  +    <hr />
  +
  +    <h3><a id="trace" name="trace">Appendix: Detailed Analysis of a
  +    Trace</a></h3>
  +    Here is a system call trace of Apache 1.3 running on Linux. The
  +    run-time configuration file is essentially the default plus: 
   
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   &lt;Directory /&gt;
       AllowOverride none
       Options FollowSymLinks
   &lt;/Directory&gt;
  -</PRE></BLOCKQUOTE>
  -
  -The file being requested is a static 6K file of no particular content.
  -Traces of non-static requests or requests with content negotiation
  -look wildly different (and quite ugly in some cases).  First the
  -entire trace, then we'll examine details.  (This was generated by
  -the <CODE>strace</CODE> program, other similar programs include
  -<CODE>truss</CODE>, <CODE>ktrace</CODE>, and <CODE>par</CODE>.)
  +</pre>
  +    </blockquote>
  +    The file being requested is a static 6K file of no particular
  +    content. Traces of non-static requests or requests with content
  +    negotiation look wildly different (and quite ugly in some
  +    cases). First the entire trace, then we'll examine details.
  +    (This was generated by the <code>strace</code> program; other
  +    similar programs include <code>truss</code>,
  +    <code>ktrace</code>, and <code>par</code>.) 
   
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   accept(15, {sin_family=AF_INET, sin_port=htons(22283), sin_addr=inet_addr("127.0.0.1")}, [16]) = 3
   flock(18, LOCK_UN)                      = 0
   sigaction(SIGUSR1, {SIG_IGN}, {0x8059954, [], SA_INTERRUPT}) = 0
  @@ -643,203 +725,232 @@
   sigaction(SIGUSR1, {0x8059954, [], SA_INTERRUPT}, {SIG_IGN}) = 0
   munmap(0x400ee000, 6144)                = 0
   flock(18, LOCK_EX)                      = 0
  -</PRE></BLOCKQUOTE>
  +</pre>
  +    </blockquote>
   
  -<P>Notice the accept serialization:
  +    <p>Notice the accept serialization:</p>
   
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   flock(18, LOCK_UN)                      = 0
   ...
   flock(18, LOCK_EX)                      = 0
  -</PRE></BLOCKQUOTE>
  +</pre>
  +    </blockquote>
  +    These two calls can be removed by defining
  +    <code>SINGLE_LISTEN_UNSERIALIZED_ACCEPT</code> as described
  +    earlier. 
   
  -These two calls can be removed by defining
  -<CODE>SINGLE_LISTEN_UNSERIALIZED_ACCEPT</CODE> as described earlier.
  +    <p>Notice the <code>SIGUSR1</code> manipulation:</p>
   
  -<P>Notice the <CODE>SIGUSR1</CODE> manipulation:
  -
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   sigaction(SIGUSR1, {SIG_IGN}, {0x8059954, [], SA_INTERRUPT}) = 0
   ...
   sigaction(SIGUSR1, {SIG_IGN}, {SIG_IGN}) = 0
   ...
   sigaction(SIGUSR1, {0x8059954, [], SA_INTERRUPT}, {SIG_IGN}) = 0
  -</PRE></BLOCKQUOTE>
  -
  -This is caused by the implementation of graceful restarts.  When the
  -parent receives a <CODE>SIGUSR1</CODE> it sends a <CODE>SIGUSR1</CODE>
  -to all of its children (and it also increments a "generation counter"
  -in shared memory).  Any children that are idle (between connections)
  -will immediately die
  -off when they receive the signal.  Any children that are in keep-alive
  -connections, but are in between requests will die off immediately.  But
  -any children that have a connection and are still waiting for the first
  -request will not die off immediately.
  -
  -<P>To see why this is necessary, consider how a browser reacts to a closed
  -connection.  If the connection was a keep-alive connection and the request
  -being serviced was not the first request then the browser will quietly
  -reissue the request on a new connection.  It has to do this because the
  -server is always free to close a keep-alive connection in between requests
  -(<EM>i.e.</EM>, due to a timeout or because of a maximum number of requests).
  -But, if the connection is closed before the first response has been
  -received the typical browser will display a "document contains no data"
  -dialogue (or a broken image icon).  This is done on the assumption that
  -the server is broken in some way (or maybe too overloaded to respond
  -at all).  So Apache tries to avoid ever deliberately closing the connection
  -before it has sent a single response.  This is the cause of those
  -<CODE>SIGUSR1</CODE> manipulations.
  +</pre>
  +    </blockquote>
  +    This is caused by the implementation of graceful restarts. When
  +    the parent receives a <code>SIGUSR1</code> it sends a
  +    <code>SIGUSR1</code> to all of its children (and it also
  +    increments a "generation counter" in shared memory). Any
  +    children that are idle (between connections) will immediately
  +    die off when they receive the signal. Any children that are in
  +    keep-alive connections, but are in between requests will die
  +    off immediately. But any children that have a connection and
  +    are still waiting for the first request will not die off
  +    immediately. 
  +
  +    <p>To see why this is necessary, consider how a browser reacts
  +    to a closed connection. If the connection was a keep-alive
  +    connection and the request being serviced was not the first
  +    request then the browser will quietly reissue the request on a
  +    new connection. It has to do this because the server is always
  +    free to close a keep-alive connection in between requests
  +    (<em>i.e.</em>, due to a timeout or because of a maximum number
  +    of requests). But, if the connection is closed before the first
  +    response has been received the typical browser will display a
  +    "document contains no data" dialogue (or a broken image icon).
  +    This is done on the assumption that the server is broken in
  +    some way (or maybe too overloaded to respond at all). So Apache
  +    tries to avoid ever deliberately closing the connection before
  +    it has sent a single response. This is the cause of those
  +    <code>SIGUSR1</code> manipulations.</p>
  +
  +    <p>Note that it is theoretically possible to eliminate all
  +    three of these calls. But in rough tests the gain proved to be
  +    almost unnoticeable.</p>
   
  -<P>Note that it is theoretically possible to eliminate all three of
  -these calls.  But in rough tests the gain proved to be almost unnoticeable.
  +    <p>In order to implement virtual hosts, Apache needs to know
  +    the local socket address used to accept the connection:</p>
   
  -<P>In order to implement virtual hosts, Apache needs to know the
  -local socket address used to accept the connection:
  -
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   getsockname(3, {sin_family=AF_INET, sin_port=htons(8080), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
  -</PRE></BLOCKQUOTE>
  -
  -It is possible to eliminate this call in many situations (such as when
  -there are no virtual hosts, or when <CODE>Listen</CODE> directives are
  -used which do not have wildcard addresses).  But no effort has yet been
  -made to do these optimizations.
  +</pre>
  +    </blockquote>
  +    It is possible to eliminate this call in many situations (such
  +    as when there are no virtual hosts, or when <code>Listen</code>
  +    directives are used which do not have wildcard addresses). But
  +    no effort has yet been made to do these optimizations. 
   
  -<P>Apache turns off the Nagle algorithm:
  +    <p>Apache turns off the Nagle algorithm:</p>
   
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   setsockopt(3, IPPROTO_TCP, 1, [1], 4)   = 0
  -</PRE></BLOCKQUOTE>
  -
  -because of problems described in 
  -<A HREF="http://www.isi.edu/~johnh/PAPERS/Heidemann97a.html">a
  -paper by John Heidemann</A>.
  +</pre>
  +    </blockquote>
  +    because of problems described in <a
  +    href="http://www.isi.edu/~johnh/PAPERS/Heidemann97a.html">a
  +    paper by John Heidemann</a>. 
   
  -<P>Notice the two <CODE>time</CODE> calls:
  +    <p>Notice the two <code>time</code> calls:</p>
   
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   time(NULL)                              = 873959960
   ...
   time(NULL)                              = 873959960
  -</PRE></BLOCKQUOTE>
  -
  -One of these occurs at the beginning of the request, and the other occurs
  -as a result of writing the log.  At least one of these is required to
  -properly implement the HTTP protocol.  The second occurs because the
  -Common Log Format dictates that the log record include a timestamp of the
  -end of the request.  A custom logging module could eliminate one of the
  -calls.  Or you can use a method which moves the time into shared memory,
  -see the <A HREF="#patches">patches section below</A>.
  -
  -<P>As described earlier, <CODE>ExtendedStatus On</CODE> causes two
  -<CODE>gettimeofday</CODE> calls and a call to <CODE>times</CODE>:
  +</pre>
  +    </blockquote>
  +    One of these occurs at the beginning of the request, and the
  +    other occurs as a result of writing the log. At least one of
  +    these is required to properly implement the HTTP protocol. The
  +    second occurs because the Common Log Format dictates that the
  +    log record include a timestamp of the end of the request. A
  +    custom logging module could eliminate one of the calls. Or you
  +    can use a method which moves the time into shared memory, see
  +    the <a href="#patches">patches section below</a>. 
  +
  +    <p>As described earlier, <code>ExtendedStatus On</code> causes
  +    two <code>gettimeofday</code> calls and a call to
  +    <code>times</code>:</p>
   
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   gettimeofday({873959960, 404935}, NULL) = 0
   ...
   gettimeofday({873959960, 417742}, NULL) = 0
   times({tms_utime=5, tms_stime=0, tms_cutime=0, tms_cstime=0}) = 446747
  -</PRE></BLOCKQUOTE>
  +</pre>
  +    </blockquote>
  +    These can be removed by setting <code>ExtendedStatus Off</code>
  +    (which is the default). 
   
  -These can be removed by setting <CODE>ExtendedStatus Off</CODE> (which
  -is the default).
  +    <p>It might seem odd to call <code>stat</code>:</p>
   
  -<P>It might seem odd to call <CODE>stat</CODE>:
  -
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   stat("/home/dgaudet/ap/apachen/htdocs/6k", {st_mode=S_IFREG|0644, st_size=6144, ...}) = 0
  -</PRE></BLOCKQUOTE>
  -
  -This is part of the algorithm which calculates the
  -<CODE>PATH_INFO</CODE> for use by CGIs.  In fact if the request had been
  -for the URI <CODE>/cgi-bin/printenv/foobar</CODE> then there would be
  -two calls to <CODE>stat</CODE>.  The first for
  -<CODE>/home/dgaudet/ap/apachen/cgi-bin/printenv/foobar</CODE>
  -which does not exist, and the second for
  -<CODE>/home/dgaudet/ap/apachen/cgi-bin/printenv</CODE>, which does exist.
  -Regardless, at least one <CODE>stat</CODE> call is necessary when
  -serving static files because the file size and modification times are
  -used to generate HTTP headers (such as <CODE>Content-Length</CODE>,
  -<CODE>Last-Modified</CODE>) and implement protocol features (such
  -as <CODE>If-Modified-Since</CODE>).  A somewhat more clever server
  -could avoid the <CODE>stat</CODE> when serving non-static files,
  -however doing so in Apache is very difficult given the modular structure.
  +</pre>
  +    </blockquote>
  +    This is part of the algorithm which calculates the
  +    <code>PATH_INFO</code> for use by CGIs. In fact if the request
  +    had been for the URI <code>/cgi-bin/printenv/foobar</code> then
  +    there would be two calls to <code>stat</code>. The first for
  +    <code>/home/dgaudet/ap/apachen/cgi-bin/printenv/foobar</code>
  +    which does not exist, and the second for
  +    <code>/home/dgaudet/ap/apachen/cgi-bin/printenv</code>, which
  +    does exist. Regardless, at least one <code>stat</code> call is
  +    necessary when serving static files because the file size and
  +    modification times are used to generate HTTP headers (such as
  +    <code>Content-Length</code>, <code>Last-Modified</code>) and
  +    implement protocol features (such as
  +    <code>If-Modified-Since</code>). A somewhat more clever server
  +    could avoid the <code>stat</code> when serving non-static
  +    files, however doing so in Apache is very difficult given the
  +    modular structure. 
   
  -<P>All static files are served using <CODE>mmap</CODE>:
  +    <p>All static files are served using <code>mmap</code>:</p>
   
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   mmap(0, 6144, PROT_READ, MAP_PRIVATE, 4, 0) = 0x400ee000
   ...
   munmap(0x400ee000, 6144)                = 0
  -</PRE></BLOCKQUOTE>
  -
  -On some architectures it's slower to <CODE>mmap</CODE> small
  -files than it is to simply <CODE>read</CODE> them.  The define
  -<CODE>MMAP_THRESHOLD</CODE> can be set to the minimum
  -size required before using <CODE>mmap</CODE>.  By default
  -it's set to 0 (except on SunOS4 where experimentation has
  -shown 8192 to be a better value).  Using a tool such as <A
  -HREF="http://www.bitmover.com/lmbench/">lmbench</A> you
  -can determine the optimal setting for your environment.
  -
  -<P>You may also wish to experiment with <CODE>MMAP_SEGMENT_SIZE</CODE>
  -(default 32768) which determines the maximum number of bytes that
  -will be written at a time from mmap()d files.  Apache only resets the
  -client's <CODE>Timeout</CODE> in between write()s.  So setting this
  -large may lock out low bandwidth clients unless you also increase the
  -<CODE>Timeout</CODE>.
  -
  -<P>It may even be the case that <CODE>mmap</CODE> isn't
  -used on your architecture; if so then defining <CODE>USE_MMAP_FILES</CODE>
  -and <CODE>HAVE_MMAP</CODE> might work (if it works then report back to us).
  -
  -<P>Apache does its best to avoid copying bytes around in memory.  The
  -first write of any request typically is turned into a <CODE>writev</CODE>
  -which combines both the headers and the first hunk of data:
  +</pre>
  +    </blockquote>
  +    On some architectures it's slower to <code>mmap</code> small
  +    files than it is to simply <code>read</code> them. The define
  +    <code>MMAP_THRESHOLD</code> can be set to the minimum size
  +    required before using <code>mmap</code>. By default it's set to
  +    0 (except on SunOS4 where experimentation has shown 8192 to be
  +    a better value). Using a tool such as <a
  +    href="http://www.bitmover.com/lmbench/">lmbench</a> you can
  +    determine the optimal setting for your environment. 
  +
  +    <p>You may also wish to experiment with
  +    <code>MMAP_SEGMENT_SIZE</code> (default 32768) which determines
  +    the maximum number of bytes that will be written at a time from
  +    mmap()d files. Apache only resets the client's
  +    <code>Timeout</code> in between write()s. So setting this large
  +    may lock out low bandwidth clients unless you also increase the
  +    <code>Timeout</code>.</p>
  +
  +    <p>It may even be the case that <code>mmap</code> isn't used on
  +    your architecture; if so then defining
  +    <code>USE_MMAP_FILES</code> and <code>HAVE_MMAP</code> might
  +    work (if it works then report back to us).</p>
  +
  +    <p>Apache does its best to avoid copying bytes around in
  +    memory. The first write of any request typically is turned into
  +    a <code>writev</code> which combines both the headers and the
  +    first hunk of data:</p>
   
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   writev(3, [{"HTTP/1.1 200 OK\r\nDate: Thu, 11"..., 245}, {"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 6144}], 2) = 6389
  -</PRE></BLOCKQUOTE>
  +</pre>
  +    </blockquote>
  +    When doing HTTP/1.1 chunked encoding Apache will generate up to
  +    four element <code>writev</code>s. The goal is to push the byte
  +    copying into the kernel, where it typically has to happen
  +    anyhow (to assemble network packets). On testing, various
  +    Unixes (BSDI 2.x, Solaris 2.5, Linux 2.0.31+) properly combine
  +    the elements into network packets. Pre-2.0.31 Linux will not
  +    combine, and will create a packet for each element, so
  +    upgrading is a good idea. Defining <code>NO_WRITEV</code> will
  +    disable this combining, but result in very poor chunked
  +    encoding performance. 
   
  -When doing HTTP/1.1 chunked encoding Apache will generate up to four
  -element <CODE>writev</CODE>s.  The goal is to push the byte copying
  -into the kernel, where it typically has to happen anyhow (to assemble
  -network packets).  On testing, various Unixes (BSDI 2.x, Solaris 2.5,
  -Linux 2.0.31+) properly combine the elements into network packets.
  -Pre-2.0.31 Linux will not combine, and will create a packet for
  -each element, so upgrading is a good idea.  Defining <CODE>NO_WRITEV</CODE>
  -will disable this combining, but result in very poor chunked encoding
  -performance.
  +    <p>The log write:</p>
   
  -<P>The log write:
  -
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   write(17, "127.0.0.1 - - [10/Sep/1997:23:39"..., 71) = 71
  -</PRE></BLOCKQUOTE>
  -
  -can be deferred by defining <CODE>BUFFERED_LOGS</CODE>.  In this case
  -up to <CODE>PIPE_BUF</CODE> bytes (a POSIX defined constant) of log entries
  -are buffered before writing.  At no time does it split a log entry
  -across a <CODE>PIPE_BUF</CODE> boundary because those writes may not
  -be atomic (<EM>i.e.</EM>, entries from multiple children could become mixed together).
  -The code does its best to flush this buffer when a child dies.
  +</pre>
  +    </blockquote>
  +    can be deferred by defining <code>BUFFERED_LOGS</code>. In this
  +    case up to <code>PIPE_BUF</code> bytes (a POSIX defined
  +    constant) of log entries are buffered before writing. At no
  +    time does it split a log entry across a <code>PIPE_BUF</code>
  +    boundary because those writes may not be atomic
  +    (<em>i.e.</em>, entries from multiple children could become
  +    mixed together). The code does its best to flush this buffer
  +    when a child dies. 
   
  -<P>The lingering close code causes four system calls:
  +    <p>The lingering close code causes four system calls:</p>
   
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   shutdown(3, 1 /* send */)               = 0
   oldselect(4, [3], NULL, [3], {2, 0})    = 1 (in [3], left {2, 0})
   read(3, "", 2048)                       = 0
   close(3)                                = 0
  -</PRE></BLOCKQUOTE>
  +</pre>
  +    </blockquote>
  +    which were described earlier. 
  +
  +    <p>Let's apply some of these optimizations:
  +    <code>-DSINGLE_LISTEN_UNSERIALIZED_ACCEPT
  +    -DBUFFERED_LOGS</code> and <code>ExtendedStatus Off</code>.
  +    Here's the final trace:</p>
   
  -which were described earlier.
  -
  -<P>Let's apply some of these optimizations:
  -<CODE>-DSINGLE_LISTEN_UNSERIALIZED_ACCEPT -DBUFFERED_LOGS</CODE> and
  -<CODE>ExtendedStatus Off</CODE>.  Here's the final trace:
  -
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
   accept(15, {sin_family=AF_INET, sin_port=htons(22286), sin_addr=inet_addr("127.0.0.1")}, [16]) = 3
   sigaction(SIGUSR1, {SIG_IGN}, {0x8058c98, [], SA_INTERRUPT}) = 0
   getsockname(3, {sin_family=AF_INET, sin_port=htons(8080), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
  @@ -859,85 +970,94 @@
   close(3)                                = 0
   sigaction(SIGUSR1, {0x8058c98, [], SA_INTERRUPT}, {SIG_IGN}) = 0
   munmap(0x400e3000, 6144)                = 0
  -</PRE></BLOCKQUOTE>
  -
  -That's 19 system calls, of which 4 remain relatively easy to remove,
  -but don't seem worth the effort.
  -
  -<H3><A NAME="patches">Appendix: Patches Available</A></H3>
  +</pre>
  +    </blockquote>
  +    That's 19 system calls, of which 4 remain relatively easy to
  +    remove, but don't seem worth the effort. 
  +
  +    <h3><a id="patches" name="patches">Appendix: Patches
  +    Available</a></h3>
  +    There are <a
  +    href="http://www.arctic.org/~dgaudet/apache/1.3/">several
  +    performance patches available for 1.3.</a> Although they may
  +    not apply cleanly to the current version, it shouldn't be
  +    difficult for someone with a little C knowledge to update them.
  +    In particular: 
  +
  +    <ul>
  +      <li>A <a
  +      href="http://www.arctic.org/~dgaudet/apache/1.3/shared_time.patch">
  +      patch</a> to remove all <code>time(2)</code> system
  +      calls.</li>
  +
  +      <li>A <a
  +      href="http://www.arctic.org/~dgaudet/apache/1.3/mod_include_speedups.patch">
  +      patch</a> to remove various system calls from
  +      <code>mod_include</code>, these calls are used by few sites
  +      but required for backwards compatibility.</li>
  +
  +      <li>A <a
  +      href="http://www.arctic.org/~dgaudet/apache/1.3/top_fuel.patch">
  +      patch</a> which integrates the above two plus a few other
  +      speedups at the cost of removing some functionality.</li>
  +    </ul>
  +
  +    <h3><a id="preforking" name="preforking">Appendix: The
  +    Pre-Forking Model</a></h3>
  +
  +    <p>Apache (on Unix) is a <em>pre-forking</em> model server. The
  +    <em>parent</em> process is responsible only for forking
  +    <em>child</em> processes, it does not serve any requests or
  +    service any network sockets. The child processes actually
  +    process connections, they serve multiple connections (one at a
  +    time) before dying. The parent spawns new or kills off old
  +    children in response to changes in the load on the server (it
  +    does so by monitoring a scoreboard which the children keep up
  +    to date).</p>
  +
  +    <p>This model for servers offers a robustness that other models
  +    do not. In particular, the parent code is very simple, and with
  +    a high degree of confidence the parent will continue to do its
  +    job without error. The children are complex, and when you add
  +    in third party code via modules, you risk segmentation faults
  +    and other forms of corruption. Even should such a thing happen,
  +    it only affects one connection and the server continues serving
  +    requests. The parent quickly replaces the dead child.</p>
  +
  +    <p>Pre-forking is also very portable across dialects of Unix.
  +    Historically this has been an important goal for Apache, and it
  +    continues to remain so.</p>
  +
  +    <p>The pre-forking model comes under criticism for various
  +    performance aspects. Of particular concern are the overhead of
  +    forking a process, the overhead of context switches between
  +    processes, and the memory overhead of having multiple
  +    processes. Furthermore it does not offer as many opportunities
  +    for data-caching between requests (such as a pool of
  +    <code>mmapped</code> files). Various other models exist and
  +    extensive analysis can be found in the <a
  +    href="http://www.cs.wustl.edu/~jxh/research/research.html">papers
  +    of the JAWS project</a>. In practice all of these costs vary
  +    drastically depending on the operating system.</p>
  +
  +    <p>Apache's core code is already multithread aware, and Apache
  +    version 1.3 is multithreaded on NT. There have been at least
  +    two other experimental implementations of threaded Apache, one
  +    using the 1.3 code base on DCE, and one using a custom
  +    user-level threads package and the 1.0 code base; neither is
  +    publicly available. There is also an experimental port of
  +    Apache 1.3 to <a
  +    href="http://www.mozilla.org/docs/refList/refNSPR/">Netscape's
  +    Portable Run Time</a>, which <a
  +    href="http://www.arctic.org/~dgaudet/apache/2.0/">is
  +    available</a> (but you're encouraged to join the <a
  +    href="http://dev.apache.org/mailing-lists">new-httpd mailing
  +    list</a> if you intend to use it). Part of our redesign for
  +    version 2.0 of Apache will include abstractions of the server
  +    model so that we can continue to support the pre-forking model,
  +    and also support various threaded models. 
  +    <!--#include virtual="footer.html" -->
  +    </p>
  +  </body>
  +</html>
   
  -There are
  -<A HREF="http://www.arctic.org/~dgaudet/apache/1.3/">
  -several performance patches available for 1.3.</A>  Although they may
  -not apply cleanly to the current version,
  -it shouldn't be difficult for someone with a little C knowledge to
  -update them.  In particular:
  -
  -<UL>
  -<LI>A 
  -<A HREF="http://www.arctic.org/~dgaudet/apache/1.3/shared_time.patch"
  ->patch</A> to remove all <CODE>time(2)</CODE> system calls.
  -<LI>A
  -<A HREF="http://www.arctic.org/~dgaudet/apache/1.3/mod_include_speedups.patch"
  ->patch</A> to remove various system calls from <CODE>mod_include</CODE>,
  -these calls are used by few sites but required for backwards compatibility.
  -<LI>A
  -<A HREF="http://www.arctic.org/~dgaudet/apache/1.3/top_fuel.patch"
  ->patch</A> which integrates the above two plus a few other speedups at the
  -cost of removing some functionality.
  -</UL>
  -
  -<H3><a name="preforking">Appendix: The Pre-Forking Model</a></H3>
  -
  -<P>Apache (on Unix) is a <EM>pre-forking</EM> model server.  The
  -<EM>parent</EM> process is responsible only for forking <EM>child</EM>
  -processes, it does not serve any requests or service any network
  -sockets.  The child processes actually process connections, they serve
  -multiple connections (one at a time) before dying.
  -The parent spawns new or kills off old
  -children in response to changes in the load on the server (it does so
  -by monitoring a scoreboard which the children keep up to date).
  -
  -<P>This model for servers offers a robustness that other models do
  -not.  In particular, the parent code is very simple, and with a high
  -degree of confidence the parent will continue to do its job without
  -error.  The children are complex, and when you add in third party
  -code via modules, you risk segmentation faults and other forms of
  -corruption.  Even should such a thing happen, it only affects one
  -connection and the server continues serving requests.  The parent
  -quickly replaces the dead child.
  -
  -<P>Pre-forking is also very portable across dialects of Unix.
  -Historically this has been an important goal for Apache, and it continues
  -to remain so.
  -
  -<P>The pre-forking model comes under criticism for various
  -performance aspects.  Of particular concern are the overhead
  -of forking a process, the overhead of context switches between
  -processes, and the memory overhead of having multiple processes.
  -Furthermore it does not offer as many opportunities for data-caching
  -between requests (such as a pool of <CODE>mmapped</CODE> files).
  -Various other models exist and extensive analysis can be found in the
  -<A HREF="http://www.cs.wustl.edu/~jxh/research/research.html"> papers
  -of the JAWS project</A>.  In practice all of these costs vary drastically
  -depending on the operating system.
  -
  -<P>Apache's core code is already multithread aware, and Apache version
  -1.3 is multithreaded on NT.  There have been at least two other experimental
  -implementations of threaded Apache, one using the 1.3 code base on DCE,
  -and one using a custom user-level threads package and the 1.0 code base;
  -neither is publicly available.  There is also an experimental port of
  -Apache 1.3 to <A HREF="http://www.mozilla.org/docs/refList/refNSPR/">
  -Netscape's Portable Run Time</A>, which
  -<A HREF="http://www.arctic.org/~dgaudet/apache/2.0/">is available</A>
  -(but you're encouraged to join the
  -<A HREF="http://dev.apache.org/mailing-lists">new-httpd mailing list</A>
  -if you intend to use it).
  -Part of our redesign for version 2.0
  -of Apache will include abstractions of the server model so that we
  -can continue to support the pre-forking model, and also support various
  -threaded models.
  -
  -<!--#include virtual="footer.html" -->
  -</BODY>
  -</HTML>
  
  
  
  1.8       +1810 -1356 httpd-2.0/docs/manual/misc/rewriteguide.html
  
  Index: rewriteguide.html
  ===================================================================
  RCS file: /home/cvs/httpd-2.0/docs/manual/misc/rewriteguide.html,v
  retrieving revision 1.7
  retrieving revision 1.8
  diff -u -r1.7 -r1.8
  --- rewriteguide.html	2000/02/11 08:58:28	1.7
  +++ rewriteguide.html	2001/09/22 19:33:40	1.8
  @@ -1,114 +1,129 @@
  -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
  -<HTML><HEAD>
  -<TITLE>Apache 1.3 URL Rewriting Guide</TITLE>
  -</HEAD>
  -
  -<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  -<BODY
  - BGCOLOR="#FFFFFF"
  - TEXT="#000000"
  - LINK="#0000FF"
  - VLINK="#000080"
  - ALINK="#FF0000"
  ->
  -<BLOCKQUOTE>
  -<!--#include virtual="header.html" -->
  -
  -<DIV ALIGN=CENTER>
  -
  -<H1>
  -Apache 1.3<BR>
  -URL Rewriting Guide<BR>
  -</H1>
  -
  -<ADDRESS>Originally written by<BR>
  -Ralf S. Engelschall &lt;rse@apache.org&gt<BR>
  -December 1997</ADDRESS>
  -
  -</DIV>
  -
  -<P>
  -This document supplements the mod_rewrite <A
  -HREF="../mod/mod_rewrite.html">reference documentation</A>. It describes
  -how one can use Apache's mod_rewrite to solve typical URL-based problems
  -webmasters are usually confronted with in practice. I give detailed
  -descriptions on how to solve each problem by configuring URL rewriting
  -rulesets.
  -
  -<H2><A name="ToC1">Introduction to mod_rewrite</A></H2>
  -
  -The Apache module mod_rewrite is a killer one, i.e. it is a really
  -sophisticated module which provides a powerful way to do URL manipulations.
  -With it you can nearly do all types of URL manipulations you ever dreamed
  -about. The price you have to pay is to accept complexity, because
  -mod_rewrite's major drawback is that it is not easy to understand and use for
  -the beginner. And even Apache experts sometimes discover new aspects where
  -mod_rewrite can help.
  -<P>
  -In other words: With mod_rewrite you either shoot yourself in the foot the
  -first time and never use it again or love it for the rest of your life because
  -of its power. This paper tries to give you a few initial success events to
  -avoid the first case by presenting already invented solutions to you.
  -
  -<H2><A name="ToC2">Practical Solutions</A></H2>
  -
  -Here come a lot of practical solutions I've either invented myself or
  -collected from other people's solutions in the past. Feel free to learn the
  -black magic of URL rewriting from these examples.
  -
  -<P>
  -<TABLE BGCOLOR="#FFE0E0" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD>
  -ATTENTION: Depending on your server-configuration it can be necessary to
  -slightly change the examples for your situation, e.g. adding the [PT] flag
  -when additionally using mod_alias and mod_userdir, etc. Or rewriting a ruleset
  -to fit in <CODE>.htaccess</CODE> context instead of per-server context. Always try
  -to understand what a particular ruleset really does before you use it. It
  -avoids problems.
  -</TD></TR></TABLE>
  -
  -<H1>URL Layout</H1>
  -
  -<P>
  -<H2>Canonical URLs</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -On some webservers there is more than one URL for a resource.  Usually there
  -are canonical URLs (which should actually be used and distributed) and those
  -which are just shortcuts, internal ones, etc.  Independent of which URL the user
  -supplied with the request, he should finally see the canonical one only.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -We do an external HTTP redirect for all non-canonical URLs to fix them in the
  -location view of the Browser and for all subsequent requests. In the example
  -ruleset below we replace <CODE>/~user</CODE> by the canonical <CODE>/u/user</CODE> and
  -fix a missing trailing slash for <CODE>/u/user</CODE>.
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  -RewriteRule   ^/<STRONG>~</STRONG>([^/]+)/?(.*)    /<STRONG>u</STRONG>/$1/$2  [<STRONG>R</STRONG>]
  -RewriteRule   ^/([uge])/(<STRONG>[^/]+</STRONG>)$  /$1/$2<STRONG>/</STRONG>   [<STRONG>R</STRONG>]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Canonical Hostnames</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -...
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  +    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
   
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +<html xmlns="http://www.w3.org/1999/xhtml">
  +  <head>
  +    <meta name="generator" content="HTML Tidy, see www.w3.org" />
  +
  +    <title>Apache 1.3 URL Rewriting Guide</title>
  +  </head>
  +  <!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  +
  +  <body bgcolor="#FFFFFF" text="#000000" link="#0000FF"
  +  vlink="#000080" alink="#FF0000">
  +    <blockquote>
  +      <!--#include virtual="header.html" -->
  +
  +      <div align="CENTER">
  +        <h1>Apache 1.3<br />
  +         URL Rewriting Guide<br />
  +        </h1>
  +
  +        <address>
  +          Originally written by<br />
  +           Ralf S. Engelschall &lt;rse@apache.org&gt;<br />
  +           December 1997
  +        </address>
  +      </div>
  +
  +      <p>This document supplements the mod_rewrite <a
  +      href="../mod/mod_rewrite.html">reference documentation</a>.
  +      It describes how one can use Apache's mod_rewrite to solve
  +      typical URL-based problems webmasters are usually confronted
  +      with in practice. I give detailed descriptions on how to
  +      solve each problem by configuring URL rewriting rulesets.</p>
  +
  +      <h2><a id="ToC1" name="ToC1">Introduction to
  +      mod_rewrite</a></h2>
  +      The Apache module mod_rewrite is a killer one, i.e. it is a
  +      really sophisticated module which provides a powerful way to
  +      do URL manipulations. With it you can nearly do all types of
  +      URL manipulations you ever dreamed about. The price you have
  +      to pay is to accept complexity, because mod_rewrite's major
  +      drawback is that it is not easy to understand and use for the
  +      beginner. And even Apache experts sometimes discover new
  +      aspects where mod_rewrite can help. 
  +
  +      <p>In other words: With mod_rewrite you either shoot yourself
  +      in the foot the first time and never use it again or love it
  +      for the rest of your life because of its power. This paper
  +      tries to give you a few initial success events to avoid the
  +      first case by presenting already invented solutions to
  +      you.</p>
  +
  +      <h2><a id="ToC2" name="ToC2">Practical Solutions</a></h2>
  +      Here come a lot of practical solutions I've either invented
  +      myself or collected from other people's solutions in the past.
  +      Feel free to learn the black magic of URL rewriting from
  +      these examples. 
  +
  +      <table bgcolor="#FFE0E0" border="0" cellspacing="0"
  +      cellpadding="5">
  +        <tr>
  +          <td>ATTENTION: Depending on your server-configuration it
  +          can be necessary to slightly change the examples for your
  +          situation, e.g. adding the [PT] flag when additionally
  +          using mod_alias and mod_userdir, etc. Or rewriting a
  +          ruleset to fit in <code>.htaccess</code> context instead
  +          of per-server context. Always try to understand what a
  +          particular ruleset really does before you use it. It
  +          avoids problems.</td>
  +        </tr>
  +      </table>
  +
  +      <h1>URL Layout</h1>
  +
  +      <h2>Canonical URLs</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>On some webservers there is more than one URL for a
  +        resource. Usually there are canonical URLs (which should
  +        actually be used and distributed) and those which are just
  +        shortcuts, internal ones, etc. Independent of which URL the
  +        user supplied with the request, he should finally see the
  +        canonical one only.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          We do an external HTTP redirect for all non-canonical
  +          URLs to fix them in the location view of the Browser and
  +          for all subsequent requests. In the example ruleset below
  +          we replace <code>/~user</code> by the canonical
  +          <code>/u/user</code> and fix a missing trailing slash for
  +          <code>/u/user</code>. 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
  +RewriteRule   ^/<strong>~</strong>([^/]+)/?(.*)    /<strong>u</strong>/$1/$2  [<strong>R</strong>]
  +RewriteRule   ^/([uge])/(<strong>[^/]+</strong>)$  /$1/$2<strong>/</strong>   [<strong>R</strong>]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Canonical Hostnames</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>...</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteCond %{HTTP_HOST}   !^fully\.qualified\.domain\.name [NC]
   RewriteCond %{HTTP_HOST}   !^$
   RewriteCond %{SERVER_PORT} !^80$
  @@ -116,228 +131,281 @@
   RewriteCond %{HTTP_HOST}   !^fully\.qualified\.domain\.name [NC]
   RewriteCond %{HTTP_HOST}   !^$
   RewriteRule ^/(.*)         http://fully.qualified.domain.name/$1 [L,R]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Moved DocumentRoot</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -Usually the DocumentRoot of the webserver directly relates to the URL
  -``<CODE>/</CODE>''. But often this data is not really of top-level priority, it is
  -perhaps just one entity of a lot of data pools. For instance at our Intranet
  -sites there are <CODE>/e/www/</CODE> (the homepage for WWW), <CODE>/e/sww/</CODE> (the
  -homepage for the Intranet) etc. Now because the data of the DocumentRoot stays
  -at <CODE>/e/www/</CODE> we had to make sure that all inlined images and other
  -stuff inside this data pool work for subsequent requests. 
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -We just redirect the URL <CODE>/</CODE> to <CODE>/e/www/</CODE>.  While is seems
  -trivial it is actually trivial with mod_rewrite, only.  Because the typical
  -old mechanisms of URL <EM>Aliases</EM> (as provides by mod_alias and friends)
  -only used <EM>prefix</EM> matching. With this you cannot do such a redirection
  -because the DocumentRoot is a prefix of all URLs. With mod_rewrite it is
  -really trivial:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Moved DocumentRoot</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>Usually the DocumentRoot of the webserver directly
  +        relates to the URL ``<code>/</code>''. But often this data
  +        is not really of top-level priority, it is perhaps just one
  +        entity of a lot of data pools. For instance at our Intranet
  +        sites there are <code>/e/www/</code> (the homepage for
  +        WWW), <code>/e/sww/</code> (the homepage for the Intranet)
  +        etc. Now because the data of the DocumentRoot stays at
  +        <code>/e/www/</code> we had to make sure that all inlined
  +        images and other stuff inside this data pool work for
  +        subsequent requests.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          We just redirect the URL <code>/</code> to
  +          <code>/e/www/</code>. While it seems trivial, it is
  +          actually trivial with mod_rewrite only, because the
  +          typical old mechanisms of URL <em>Aliases</em> (as
  +          provided by mod_alias and friends) only used
  +          <em>prefix</em> matching. With this you cannot do such a
  +          redirection because the DocumentRoot is a prefix of all
  +          URLs. With mod_rewrite it is really trivial: 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine on
  -RewriteRule   <STRONG>^/$</STRONG>  /e/www/  [<STRONG>R</STRONG>]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Trailing Slash Problem</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -Every webmaster can sing a song about the problem of the trailing slash on
  -URLs referencing directories. If they are missing, the server dumps an error,
  -because if you say <CODE>/~quux/foo</CODE> instead of
  -<CODE>/~quux/foo/</CODE> then the server searches for a <EM>file</EM> named
  -<CODE>foo</CODE>. And because this file is a directory it complains. Actually
  -is tries to fix it themself in most of the cases, but sometimes this mechanism
  -need to be emulated by you. For instance after you have done a lot of
  -complicated URL rewritings to CGI scripts etc. 
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -The solution to this subtle problem is to let the server add the trailing
  -slash automatically. To do this correctly we have to use an external redirect,
  -so the browser correctly requests subsequent images etc. If we only did a
  -internal rewrite, this would only work for the directory page, but would go
  -wrong when any images are included into this page with relative URLs, because
  -the browser would request an in-lined object. For instance, a request for
  -<CODE>image.gif</CODE> in <CODE>/~quux/foo/index.html</CODE> would become
  -<CODE>/~quux/image.gif</CODE> without the external redirect!
  -<P>
  -So, to do this trick we write:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteRule   <strong>^/$</strong>  /e/www/  [<strong>R</strong>]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Trailing Slash Problem</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>Every webmaster can sing a song about the problem of
  +        the trailing slash on URLs referencing directories. If it
  +        is missing, the server dumps an error, because if you say
  +        <code>/~quux/foo</code> instead of <code>/~quux/foo/</code>
  +        then the server searches for a <em>file</em> named
  +        <code>foo</code>. And because this file is a directory it
  +        complains. Actually it tries to fix it itself in most of
  +        the cases, but sometimes this mechanism needs to be emulated
  +        by you. For instance after you have done a lot of
  +        complicated URL rewritings to CGI scripts etc.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          The solution to this subtle problem is to let the server
  +          add the trailing slash automatically. To do this
  +          correctly we have to use an external redirect, so the
  +          browser correctly requests subsequent images etc. If we
  +          only did an internal rewrite, this would only work for the
  +          directory page, but would go wrong when any images are
  +          included into this page with relative URLs, because the
  +          browser would request an in-lined object. For instance, a
  +          request for <code>image.gif</code> in
  +          <code>/~quux/foo/index.html</code> would become
  +          <code>/~quux/image.gif</code> without the external
  +          redirect! 
  +
  +          <p>So, to do this trick we write:</p>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine  on
   RewriteBase    /~quux/
  -RewriteRule    ^foo<STRONG>$</STRONG>  foo<STRONG>/</STRONG>  [<STRONG>R</STRONG>]
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -The crazy and lazy can even do the following in the top-level
  -<CODE>.htaccess</CODE> file of their homedir. But notice that this creates some
  -processing overhead.
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteRule    ^foo<strong>$</strong>  foo<strong>/</strong>  [<strong>R</strong>]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>The crazy and lazy can even do the following in the
  +          top-level <code>.htaccess</code> file of their homedir.
  +          But notice that this creates some processing
  +          overhead.</p>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine  on
   RewriteBase    /~quux/
  -RewriteCond    %{REQUEST_FILENAME}  <STRONG>-d</STRONG>
  -RewriteRule    ^(.+<STRONG>[^/]</STRONG>)$           $1<STRONG>/</STRONG>  [R]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Webcluster through Homogeneous URL Layout</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -We want to create a homogenous and consistent URL layout over all WWW servers
  -on a Intranet webcluster, i.e. all URLs (per definition server local and thus
  -server dependent!) become actually server <EM>independed</EM>!  What we want is
  -to give the WWW namespace a consistent server-independend layout: no URL
  -should have to include any physically correct target server. The cluster
  -itself should drive us automatically to the physical target host.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -First, the knowledge of the target servers come from (distributed) external
  -maps which contain information where our users, groups and entities stay.  
  -The have the form
  -
  -<P><PRE>
  +RewriteCond    %{REQUEST_FILENAME}  <strong>-d</strong>
  +RewriteRule    ^(.+<strong>[^/]</strong>)$           $1<strong>/</strong>  [R]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Webcluster through Homogeneous URL Layout</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>We want to create a homogeneous and consistent URL
  +        layout over all WWW servers on an Intranet webcluster, i.e.
  +        all URLs (per definition server local and thus server
  +        dependent!) become actually server <em>independent</em>!
  +        What we want is to give the WWW namespace a consistent
  +        server-independent layout: no URL should have to include
  +        any physically correct target server. The cluster itself
  +        should drive us automatically to the physical target
  +        host.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          First, the knowledge of the target servers comes from
  +          (distributed) external maps which contain information
  +          where our users, groups and entities stay. They have the
  +          form 
  +<pre>
   user1  server_of_user1
   user2  server_of_user2
   :      :
  -</PRE><P>
  +</pre>
   
  -We put them into files <CODE>map.xxx-to-host</CODE>.  Second we need to instruct
  -all servers to redirect URLs of the forms
  -
  -<P><PRE>
  +          <p>We put them into files <code>map.xxx-to-host</code>.
  +          Second we need to instruct all servers to redirect URLs
  +          of the forms</p>
  +<pre>
   /u/user/anypath
   /g/group/anypath
   /e/entity/anypath
  -</PRE><P>
  -
  -to
  +</pre>
   
  -<P><PRE>
  +          <p>to</p>
  +<pre>
   http://physical-host/u/user/anypath
   http://physical-host/g/group/anypath
   http://physical-host/e/entity/anypath
  -</PRE><P>
  -
  -when the URL is not locally valid to a server.  The following ruleset does
  -this for us by the help of the map files (assuming that server0 is a default
  -server which will be used if a user has no entry in the map):
  +</pre>
   
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +          <p>when the URL is not locally valid to a server. The
  +          following ruleset does this for us with the help of the map
  +          files (assuming that server0 is a default server which
  +          will be used if a user has no entry in the map):</p>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine on
   
   RewriteMap      user-to-host   txt:/path/to/map.user-to-host
   RewriteMap     group-to-host   txt:/path/to/map.group-to-host
   RewriteMap    entity-to-host   txt:/path/to/map.entity-to-host
   
  -RewriteRule   ^/u/<STRONG>([^/]+)</STRONG>/?(.*)   http://<STRONG>${user-to-host:$1|server0}</STRONG>/u/$1/$2
  -RewriteRule   ^/g/<STRONG>([^/]+)</STRONG>/?(.*)  http://<STRONG>${group-to-host:$1|server0}</STRONG>/g/$1/$2
  -RewriteRule   ^/e/<STRONG>([^/]+)</STRONG>/?(.*) http://<STRONG>${entity-to-host:$1|server0}</STRONG>/e/$1/$2
  +RewriteRule   ^/u/<strong>([^/]+)</strong>/?(.*)   http://<strong>${user-to-host:$1|server0}</strong>/u/$1/$2
  +RewriteRule   ^/g/<strong>([^/]+)</strong>/?(.*)  http://<strong>${group-to-host:$1|server0}</strong>/g/$1/$2
  +RewriteRule   ^/e/<strong>([^/]+)</strong>/?(.*) http://<strong>${entity-to-host:$1|server0}</strong>/e/$1/$2
   
   RewriteRule   ^/([uge])/([^/]+)/?$          /$1/$2/.www/
   RewriteRule   ^/([uge])/([^/]+)/([^.]+.+)   /$1/$2/.www/$3\
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Move Homedirs to Different Webserver</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -A lot of webmaster aksed for a solution to the following situation: They
  -wanted to redirect just all homedirs on a webserver to another webserver.
  -They usually need such things when establishing a newer webserver which will
  -replace the old one over time.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -The solution is trivial with mod_rewrite. On the old webserver we just
  -redirect all <CODE>/~user/anypath</CODE> URLs to
  -<CODE>http://newserver/~user/anypath</CODE>.
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Move Homedirs to Different Webserver</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>A lot of webmasters asked for a solution to the
  +        following situation: They wanted to redirect just all
  +        homedirs on a webserver to another webserver. They usually
  +        need such things when establishing a newer webserver which
  +        will replace the old one over time.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          The solution is trivial with mod_rewrite. On the old
  +          webserver we just redirect all
  +          <code>/~user/anypath</code> URLs to
  +          <code>http://newserver/~user/anypath</code>. 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine on
  -RewriteRule   ^/~(.+)  http://<STRONG>newserver</STRONG>/~$1  [R,L]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Structured Homedirs</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -Some sites with thousend of users usually use a structured homedir layout,
  -i.e.  each homedir is in a subdirectory which begins for instance with the
  -first character of the username. So, <CODE>/~foo/anypath</CODE> is
  -<CODE>/home/<STRONG>f</STRONG>/foo/.www/anypath</CODE> while <CODE>/~bar/anypath</CODE> is
  -<CODE>/home/<STRONG>b</STRONG>/bar/.www/anypath</CODE>.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -We use the following ruleset to expand the tilde URLs into exactly the above
  -layout.
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteRule   ^/~(.+)  http://<strong>newserver</strong>/~$1  [R,L]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Structured Homedirs</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>Some sites with thousands of users usually use a
  +        structured homedir layout, i.e. each homedir is in a
  +        subdirectory which begins for instance with the first
  +        character of the username. So, <code>/~foo/anypath</code>
  +        is <code>/home/<strong>f</strong>/foo/.www/anypath</code>
  +        while <code>/~bar/anypath</code> is
  +        <code>/home/<strong>b</strong>/bar/.www/anypath</code>.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          We use the following ruleset to expand the tilde URLs
  +          into exactly the above layout. 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine on
  -RewriteRule   ^/~(<STRONG>([a-z])</STRONG>[a-z0-9]+)(.*)  /home/<STRONG>$2</STRONG>/$1/.www$3
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Filesystem Reorganisation</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -This really is a hardcore example: a killer application which heavily uses
  -per-directory <CODE>RewriteRules</CODE> to get a smooth look and feel on the Web
  -while its data structure is never touched or adjusted.
  -
  -Background: <STRONG><EM>net.sw</EM></STRONG> is my archive of freely available Unix
  -software packages, which I started to collect in 1992. It is both my hobby and
  -job to to this, because while I'm studying computer science I have also worked
  -for many years as a system and network administrator in my spare time. Every
  -week I need some sort of software so I created a deep hierarchy of
  -directories where I stored the packages: 
  -
  -<P><PRE>
  +RewriteRule   ^/~(<strong>([a-z])</strong>[a-z0-9]+)(.*)  /home/<strong>$2</strong>/$1/.www$3
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Filesystem Reorganisation</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>
  +          This really is a hardcore example: a killer application
  +          which heavily uses per-directory
  +          <code>RewriteRules</code> to get a smooth look and feel
  +          on the Web while its data structure is never touched or
  +          adjusted. Background: <strong><em>net.sw</em></strong> is
  +          my archive of freely available Unix software packages,
  +          which I started to collect in 1992. It is both my hobby
  +          and job to do this, because while I'm studying computer
  +          science I have also worked for many years as a system and
  +          network administrator in my spare time. Every week I need
  +          some sort of software so I created a deep hierarchy of
  +          directories where I stored the packages: 
  +<pre>
   drwxrwxr-x   2 netsw  users    512 Aug  3 18:39 Audio/
   drwxrwxr-x   2 netsw  users    512 Jul  9 14:37 Benchmark/
   drwxrwxr-x  12 netsw  users    512 Jul  9 00:34 Crypto/
  @@ -354,24 +422,27 @@
   drwxrwxr-x   7 netsw  users    512 Jul  9 12:17 System/
   drwxrwxr-x  12 netsw  users    512 Aug  3 20:15 Typesetting/
   drwxrwxr-x  10 netsw  users    512 Jul  9 14:08 X11/
  -</PRE><P>
  +</pre>
   
  -In July 1996 I decided to make this archive public to the world via a
  -nice Web interface. "Nice" means that I wanted to
  -offer an interface where you can browse directly through the archive hierarchy.
  -And "nice" means that I didn't wanted to change anything inside this hierarchy
  -- not even by putting some CGI scripts at the top of it.  Why? Because the
  -above structure should be later accessible via FTP as well, and I didn't
  -want any Web or CGI stuff to be there.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -The solution has two parts: The first is a set of CGI scripts which create all
  -the pages at all directory levels on-the-fly. I put them under
  -<CODE>/e/netsw/.www/</CODE> as follows:
  -
  -<P><PRE>
  +          <p>In July 1996 I decided to make this archive public to
  +          the world via a nice Web interface. "Nice" means that I
  +          wanted to offer an interface where you can browse
  +          directly through the archive hierarchy. And "nice" means
  +          that I didn't want to change anything inside this
  +          hierarchy - not even by putting some CGI scripts at the
  +          top of it. Why? Because the above structure should be
  +          later accessible via FTP as well, and I didn't want any
  +          Web or CGI stuff to be there.</p>
  +        </dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          The solution has two parts: The first is a set of CGI
  +          scripts which create all the pages at all directory
  +          levels on-the-fly. I put them under
  +          <code>/e/netsw/.www/</code> as follows: 
  +<pre>
   -rw-r--r--   1 netsw  users    1318 Aug  1 18:10 .wwwacl
   drwxr-xr-x  18 netsw  users     512 Aug  5 15:51 DATA/
   -rw-rw-rw-   1 netsw  users  372982 Aug  5 16:35 LOGFILE
  @@ -385,32 +456,45 @@
   -rwxr-xr-x   1 netsw  users    1589 Aug  3 18:43 netsw-search.cgi
   -rwxr-xr-x   1 netsw  users    1885 Aug  1 17:41 netsw-tree.cgi
   -rw-r--r--   1 netsw  users     234 Jul 30 16:35 netsw-unlimit.lst
  -</PRE><P>
  -
  -The <CODE>DATA/</CODE> subdirectory holds the above directory structure, i.e.  the
  -real <STRONG><EM>net.sw</EM></STRONG> stuff and gets automatically updated via
  -<CODE>rdist</CODE> from time to time. 
  -
  -The second part of the problem remains: how to link these two structures
  -together into one smooth-looking URL tree? We want to hide the <CODE>DATA/</CODE>
  -directory from the user while running the appropriate CGI scripts for the
  -various URLs. 
  -
  -Here is the solution: first I put the following into the per-directory
  -configuration file in the Document Root of the server to rewrite the announced
  -URL <CODE>/net.sw/</CODE> to the internal path <CODE>/e/netsw</CODE>:
  +</pre>
   
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +          <p>The <code>DATA/</code> subdirectory holds the above
  +          directory structure, i.e. the real
  +          <strong><em>net.sw</em></strong> stuff and gets
  +          automatically updated via <code>rdist</code> from time to
  +          time. The second part of the problem remains: how to link
  +          these two structures together into one smooth-looking URL
  +          tree? We want to hide the <code>DATA/</code> directory
  +          from the user while running the appropriate CGI scripts
  +          for the various URLs. Here is the solution: first I put
  +          the following into the per-directory configuration file
  +          in the Document Root of the server to rewrite the
  +          announced URL <code>/net.sw/</code> to the internal path
  +          <code>/e/netsw</code>:</p>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteRule  ^net.sw$       net.sw/        [R]
   RewriteRule  ^net.sw/(.*)$  e/netsw/$1
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -The first rule is for requests which miss the trailing slash!  The second rule
  -does the real thing. And then comes the killer configuration which stays in
  -the per-directory config file <CODE>/e/netsw/.www/.wwwacl</CODE>:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>The first rule is for requests which miss the trailing
  +          slash! The second rule does the real thing. And then
  +          comes the killer configuration which stays in the
  +          per-directory config file
  +          <code>/e/netsw/.www/.wwwacl</code>:</p>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   Options       ExecCGI FollowSymLinks Includes MultiViews 
   
   RewriteEngine on
  @@ -439,239 +523,309 @@
   #  by another cgi script
   RewriteRule   !^netsw-lsdir\.cgi.*     -                  [C]
   RewriteRule   (.*)                     netsw-lsdir.cgi/$1
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -Some hints for interpretation:
  -    <ol>
  -    <li> Notice the L (last) flag and no substitution field ('-') in the
  -         forth part
  -    <li> Notice the ! (not) character and the C (chain) flag
  -         at the first rule in the last part
  -    <li> Notice the catch-all pattern in the last rule
  -    </ol>
  -
  -</DL>
  -
  -<P>
  -<H2>NCSA imagemap to Apache mod_imap</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -When switching from the NCSA webserver to the more modern Apache webserver a
  -lot of people want a smooth transition. So they want pages which use their old
  -NCSA <CODE>imagemap</CODE> program to work under Apache with the modern
  -<CODE>mod_imap</CODE>. The problem is that there are a lot of
  -hyperlinks around which reference the <CODE>imagemap</CODE> program via
  -<CODE>/cgi-bin/imagemap/path/to/page.map</CODE>. Under Apache this
  -has to read just <CODE>/path/to/page.map</CODE>.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -We use a global rule to remove the prefix on-the-fly for all requests:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>Some hints for interpretation:</p>
  +
  +          <ol>
  +            <li>Notice the L (last) flag and no substitution field
  +            ('-') in the fourth part</li>
  +
  +            <li>Notice the ! (not) character and the C (chain) flag
  +            at the first rule in the last part</li>
  +
  +            <li>Notice the catch-all pattern in the last rule</li>
  +          </ol>
  +        </dd>
  +      </dl>
  +
  +      <h2>NCSA imagemap to Apache mod_imap</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>When switching from the NCSA webserver to the more
  +        modern Apache webserver a lot of people want a smooth
  +        transition. So they want pages which use their old NCSA
  +        <code>imagemap</code> program to work under Apache with the
  +        modern <code>mod_imap</code>. The problem is that there are
  +        a lot of hyperlinks around which reference the
  +        <code>imagemap</code> program via
  +        <code>/cgi-bin/imagemap/path/to/page.map</code>. Under
  +        Apache this has to read just
  +        <code>/path/to/page.map</code>.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          We use a global rule to remove the prefix on-the-fly for
  +          all requests: 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine  on
   RewriteRule    ^/cgi-bin/imagemap(.*)  $1  [PT]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Search pages in more than one directory</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -Sometimes it is neccessary to let the webserver search for pages in more than
  -one directory. Here MultiViews or other techniques cannot help.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -We program a explicit ruleset which searches for the files in the directories.
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Search pages in more than one directory</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>Sometimes it is necessary to let the webserver search
  +        for pages in more than one directory. Here MultiViews or
  +        other techniques cannot help.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          We program an explicit ruleset which searches for the
  +          files in the directories. 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine on
   
   #   first try to find it in custom/...
   #   ...and if found stop and be happy:
  -RewriteCond         /your/docroot/<STRONG>dir1</STRONG>/%{REQUEST_FILENAME}  -f
  -RewriteRule  ^(.+)  /your/docroot/<STRONG>dir1</STRONG>/$1  [L]
  +RewriteCond         /your/docroot/<strong>dir1</strong>/%{REQUEST_FILENAME}  -f
  +RewriteRule  ^(.+)  /your/docroot/<strong>dir1</strong>/$1  [L]
   
   #   second try to find it in pub/...
   #   ...and if found stop and be happy:
  -RewriteCond         /your/docroot/<STRONG>dir2</STRONG>/%{REQUEST_FILENAME}  -f
  -RewriteRule  ^(.+)  /your/docroot/<STRONG>dir2</STRONG>/$1  [L]
  +RewriteCond         /your/docroot/<strong>dir2</strong>/%{REQUEST_FILENAME}  -f
  +RewriteRule  ^(.+)  /your/docroot/<strong>dir2</strong>/$1  [L]
   
   #   else go on for other Alias or ScriptAlias directives,
   #   etc.
   RewriteRule   ^(.+)  -  [PT]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Set Environment Variables According To URL Parts</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -Perhaps you want to keep status information between requests and use the URL
  -to encode it. But you don't want to use a CGI wrapper for all pages just to
  -strip out this information.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -We use a rewrite rule to strip out the status information and remember it via
  -an environment variable which can be later dereferenced from within XSSI or
  -CGI. This way a URL <CODE>/foo/S=java/bar/</CODE> gets translated to
  -<CODE>/foo/bar/</CODE> and the environment variable named <CODE>STATUS</CODE> is set
  -to the value "java".
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Set Environment Variables According To URL Parts</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>Perhaps you want to keep status information between
  +        requests and use the URL to encode it. But you don't want
  +        to use a CGI wrapper for all pages just to strip out this
  +        information.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          We use a rewrite rule to strip out the status information
  +          and remember it via an environment variable which can be
  +          later dereferenced from within XSSI or CGI. This way a
  +          URL <code>/foo/S=java/bar/</code> gets translated to
  +          <code>/foo/bar/</code> and the environment variable named
  +          <code>STATUS</code> is set to the value "java". 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine on
  -RewriteRule   ^(.*)/<STRONG>S=([^/]+)</STRONG>/(.*)    $1/$3 [E=<STRONG>STATUS:$2</STRONG>]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Virtual User Hosts</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -Assume that you want to provide <CODE>www.<STRONG>username</STRONG>.host.domain.com</CODE>
  -for the homepage of username via just DNS A records to the same machine and
  -without any virtualhosts on this machine. 
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -For HTTP/1.0 requests there is no solution, but for HTTP/1.1 requests which
  -contain a Host: HTTP header we can use the following ruleset to rewrite
  -<CODE>http://www.username.host.com/anypath</CODE> internally to
  -<CODE>/home/username/anypath</CODE>:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteRule   ^(.*)/<strong>S=([^/]+)</strong>/(.*)    $1/$3 [E=<strong>STATUS:$2</strong>]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Virtual User Hosts</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>Assume that you want to provide
  +        <code>www.<strong>username</strong>.host.domain.com</code>
  +        for the homepage of username via just DNS A records to the
  +        same machine and without any virtualhosts on this
  +        machine.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          For HTTP/1.0 requests there is no solution, but for
  +          HTTP/1.1 requests which contain a Host: HTTP header we
  +          can use the following ruleset to rewrite
  +          <code>http://www.username.host.com/anypath</code>
  +          internally to <code>/home/username/anypath</code>: 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine on
  -RewriteCond   %{<STRONG>HTTP_HOST</STRONG>}                 ^www\.<STRONG>[^.]+</STRONG>\.host\.com$
  +RewriteCond   %{<strong>HTTP_HOST</strong>}                 ^www\.<strong>[^.]+</strong>\.host\.com$
   RewriteRule   ^(.+)                        %{HTTP_HOST}$1          [C]
  -RewriteRule   ^www\.<STRONG>([^.]+)</STRONG>\.host\.com(.*) /home/<STRONG>$1</STRONG>$2
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Redirect Homedirs For Foreigners</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -We want to redirect homedir URLs to another webserver
  -<CODE>www.somewhere.com</CODE> when the requesting user does not stay in the local
  -domain <CODE>ourdomain.com</CODE>. This is sometimes used in virtual host
  -contexts.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -Just a rewrite condition:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteRule   ^www\.<strong>([^.]+)</strong>\.host\.com(.*) /home/<strong>$1</strong>$2
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Redirect Homedirs For Foreigners</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>We want to redirect homedir URLs to another webserver
  +        <code>www.somewhere.com</code> when the requesting user
  +        is not located in the local domain
  +        <code>ourdomain.com</code>. This is sometimes used in
  +        virtual host contexts.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          Just a rewrite condition: 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine on
  -RewriteCond   %{REMOTE_HOST}  <STRONG>!^.+\.ourdomain\.com$</STRONG>
  +RewriteCond   %{REMOTE_HOST}  <strong>!^.+\.ourdomain\.com$</strong>
   RewriteRule   ^(/~.+)         http://www.somewhere.com/$1 [R,L]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Redirect Failing URLs To Other Webserver</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -A typical FAQ about URL rewriting is how to redirect failing requests on
  -webserver A to webserver B.  Usually this is done via ErrorDocument
  -CGI-scripts in Perl, but there is also a mod_rewrite solution. But notice that
  -this is less performant than using a ErrorDocument CGI-script!
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -The first solution has the best performance but less flexibility and is less
  -error safe:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Redirect Failing URLs To Other Webserver</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>A typical FAQ about URL rewriting is how to redirect
  +        failing requests on webserver A to webserver B. Usually
  +        this is done via ErrorDocument CGI-scripts in Perl, but
  +        there is also a mod_rewrite solution. But notice that this
  +        is less performant than using an ErrorDocument
  +        CGI-script!</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          The first solution has the best performance but is less
  +          flexible and less robust: 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine on
  -RewriteCond   /your/docroot/%{REQUEST_FILENAME} <STRONG>!-f</STRONG>
  -RewriteRule   ^(.+)                             http://<STRONG>webserverB</STRONG>.dom/$1
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -The problem here is that this will only work for pages inside the
  -DocumentRoot. While you can add more Conditions (for instance to also handle
  -homedirs, etc.) there is better variant:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteCond   /your/docroot/%{REQUEST_FILENAME} <strong>!-f</strong>
  +RewriteRule   ^(.+)                             http://<strong>webserverB</strong>.dom/$1
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>The problem here is that this will only work for pages
  +          inside the DocumentRoot. While you can add more
  +          Conditions (for instance to also handle homedirs, etc.)
  +          there is a better variant:</p>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine on
  -RewriteCond   %{REQUEST_URI} <STRONG>!-U</STRONG>
  -RewriteRule   ^(.+)          http://<STRONG>webserverB</STRONG>.dom/$1
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -This uses the URL look-ahead feature of mod_rewrite. The result is that this
  -will work for all types of URLs and is a safe way.  But it does a performance
  -impact on the webserver, because for every request there is one more internal
  -subrequest. So, if your webserver runs on a powerful CPU, use this one. If it
  -is a slow machine, use the first approach or better a ErrorDocument
  -CGI-script.
  -
  -</DL>
  -
  -<P>
  -<H2>Extended Redirection</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -Sometimes we need more control (concerning the character escaping mechanism)
  -of URLs on redirects. Usually the Apache kernels URL escape function also
  -escapes anchors, i.e. URLs like "url#anchor". You cannot use this directly on
  -redirects with mod_rewrite because the uri_escape() function of Apache would
  -also escape the hash character. How can we redirect to such a URL?
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -We have to use a kludge by the use of a NPH-CGI script which does the redirect
  -itself. Because here no escaping is done (NPH=non-parseable headers).  First
  -we introduce a new URL scheme <CODE>xredirect:</CODE> by the following per-server
  -config-line (should be one of the last rewrite rules):
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteCond   %{REQUEST_URI} <strong>!-U</strong>
  +RewriteRule   ^(.+)          http://<strong>webserverB</strong>.dom/$1
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>This uses the URL look-ahead feature of mod_rewrite.
  +          The result is that this works for all types of URLs
  +          and is safe. But it has a performance impact on
  +          the webserver, because for every request there is one
  +          more internal subrequest. So, if your webserver runs on a
  +          powerful CPU, use this one. If it is a slow machine, use
  +          the first approach or, better, an ErrorDocument
  +          CGI-script.</p>
  +        </dd>
  +      </dl>
  +
  +      <h2>Extended Redirection</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>Sometimes we need more control (concerning the
  +        character escaping mechanism) of URLs on redirects. Usually
  +        the Apache kernel's URL escape function also escapes
  +        anchors, i.e. URLs like "url#anchor". You cannot use this
  +        directly on redirects with mod_rewrite because the
  +        uri_escape() function of Apache would also escape the hash
  +        character. How can we redirect to such a URL?</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          We have to use a kludge: an NPH-CGI script which does
  +          the redirect itself, because no escaping is done there
  +          (NPH = non-parseable headers). First we introduce a
  +          new URL scheme <code>xredirect:</code> with the following
  +          per-server config line (should be one of the last rewrite
  +          rules): 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteRule ^xredirect:(.+) /path/to/nph-xredirect.cgi/$1 \
               [T=application/x-httpd-cgi,L]
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -This forces all URLs prefixed with <CODE>xredirect:</CODE> to be piped through the
  -<CODE>nph-xredirect.cgi</CODE> program. And this program just looks like:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  -<PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>This forces all URLs prefixed with
  +          <code>xredirect:</code> to be piped through the
  +          <code>nph-xredirect.cgi</code> program, which looks
  +          like this:</p>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   #!/path/to/perl
   ##
   ##  nph-xredirect.cgi -- NPH/CGI script for extended redirects
  @@ -697,55 +851,79 @@
   print "&lt;/html&gt;\n";
   
   ##EOF##
  -</PRE>
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -This provides you with the functionality to do redirects to all URL schemes,
  -i.e. including the one which are not directly accepted by mod_rewrite. For
  -instance you can now also redirect to <CODE>news:newsgroup</CODE> via
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>This provides you with the functionality to do
  +          redirects to all URL schemes, including the ones
  +          which are not directly accepted by mod_rewrite. For
  +          instance you can now also redirect to
  +          <code>news:newsgroup</code> via</p>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteRule ^anyurl  xredirect:news:newsgroup
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -Notice: You have not to put [R] or [R,L] to the above rule because the
  -<CODE>xredirect:</CODE> need to be expanded later by our special "pipe through"
  -rule above.
  -
  -</DL>
  -
  -<P>
  -<H2>Archive Access Multiplexer</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -Do you know the great CPAN (Comprehensive Perl Archive Network) under <A
  -HREF="http://www.perl.com/CPAN">http://www.perl.com/CPAN</A>? This does a
  -redirect to one of several FTP servers around the world which carry a CPAN
  -mirror and is approximately near the location of the requesting client.
  -Actually this can be called an FTP access multiplexing service. While CPAN
  -runs via CGI scripts, how can a similar approach implemented via mod_rewrite?
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -First we notice that from version 3.0.0 mod_rewrite can also use the "ftp:"
  -scheme on redirects. And second, the location approximation can be done by a
  -rewritemap over the top-level domain of the client. With a tricky chained
  -ruleset we can use this top-level domain as a key to our multiplexing map.
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>Notice: You must not put [R] or [R,L] on the above
  +          rule because the <code>xredirect:</code> prefix needs to be
  +          expanded later by our special "pipe through" rule
  +          above.</p>
  +        </dd>
  +      </dl>
  +
  +      <h2>Archive Access Multiplexer</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>Do you know the great CPAN (Comprehensive Perl Archive
  +        Network) under <a
  +        href="http://www.perl.com/CPAN">http://www.perl.com/CPAN</a>?
  +        This redirects to one of several FTP servers around the
  +        world which carry a CPAN mirror, choosing one approximately
  +        near the location of the requesting client. Actually this
  +        can be called an FTP access multiplexing service. While
  +        CPAN runs via CGI scripts, how can a similar approach be
  +        implemented via mod_rewrite?</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          First we notice that from version 3.0.0 mod_rewrite can
  +          also use the "ftp:" scheme on redirects. And second, the
  +          location approximation can be done by a rewritemap over
  +          the top-level domain of the client. With a tricky chained
  +          ruleset we can use this top-level domain as a key to our
  +          multiplexing map. 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine on
   RewriteMap    multiplex                txt:/path/to/map.cxan
   RewriteRule   ^/CxAN/(.*)              %{REMOTE_HOST}::$1                 [C]
  -RewriteRule   ^.+\.<STRONG>([a-zA-Z]+)</STRONG>::(.*)$  ${multiplex:<STRONG>$1</STRONG>|ftp.default.dom}$2  [R,L]
  -</PRE></TD></TR></TABLE>
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteRule   ^.+\.<strong>([a-zA-Z]+)</strong>::(.*)$  ${multiplex:<strong>$1</strong>|ftp.default.dom}$2  [R,L]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   ##
   ##  map.cxan -- Multiplexing Map for CxAN
   ##
  @@ -755,62 +933,77 @@
   com       ftp://ftp.cxan.com/CxAN/
    :
   ##EOF##
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Time-Dependend Rewriting</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -When tricks like time-dependend content should happen a lot of webmasters
  -still use CGI scripts which do for instance redirects to specialized pages.
  -How can it be done via mod_rewrite?
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -There are a lot of variables named <CODE>TIME_xxx</CODE> for rewrite conditions.
  -In conjunction with the special lexicographic comparison patterns &lt;STRING,
  -&gt;STRING and =STRING we can do time-dependend redirects:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Time-Dependent Rewriting</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>For tricks like time-dependent content, a lot of
  +        webmasters still use CGI scripts which, for
  +        instance, redirect to specialized pages. How can it be done
  +        via mod_rewrite?</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          There are a lot of variables named <code>TIME_xxx</code>
  +          for rewrite conditions. In conjunction with the special
  +          lexicographic comparison patterns &lt;STRING, &gt;STRING
  +          and =STRING we can do time-dependent redirects: 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine on
   RewriteCond   %{TIME_HOUR}%{TIME_MIN} &gt;0700
   RewriteCond   %{TIME_HOUR}%{TIME_MIN} &lt;1900
   RewriteRule   ^foo\.html$             foo.day.html
   RewriteRule   ^foo\.html$             foo.night.html
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -This provides the content of <CODE>foo.day.html</CODE> under the URL
  -<CODE>foo.html</CODE> from 07:00-19:00 and at the remaining time the contents of
  -<CODE>foo.night.html</CODE>. Just a nice feature for a homepage...
  -
  -</DL>
  -
  -<P>
  -<H2>Backward Compatibility for YYYY to XXXX migration</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -How can we make URLs backward compatible (still existing virtually) after
  -migrating document.YYYY to document.XXXX, e.g. after translating a bunch of
  -.html files to .phtml?
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -We just rewrite the name to its basename and test for existence of the new
  -extension. If it exists, we take that name, else we rewrite the URL to its
  -original state. 
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>This provides the content of <code>foo.day.html</code>
  +          under the URL <code>foo.html</code> from 07:00-19:00 and
  +          at the remaining time the contents of
  +          <code>foo.night.html</code>. Just a nice feature for a
  +          homepage...</p>
  +        </dd>
  +      </dl>
  +
  +      <h2>Backward Compatibility for YYYY to XXXX migration</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>How can we make URLs backward compatible (still
  +        existing virtually) after migrating document.YYYY to
  +        document.XXXX, e.g. after translating a bunch of .html
  +        files to .phtml?</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          We just rewrite the name to its basename and test for
  +          existence of the new extension. If it exists, we take
  +          that name, else we rewrite the URL to its original state.
  +          
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   #   backward compatibility ruleset for 
   #   rewriting document.html to document.phtml
   #   when and only when document.phtml exists
  @@ -825,237 +1018,307 @@
   #   else reverse the previous basename cutout
   RewriteCond   %{ENV:WasHTML}            ^yes$
   RewriteRule   ^(.*)$ $1.html
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<H1>Content Handling</H1>
  -
  -<P>
  -<H2>From Old to New (intern)</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -Assume we have recently renamed the page <CODE>bar.html</CODE> to
  -<CODE>foo.html</CODE> and now want to provide the old URL for backward
  -compatibility. Actually we want that users of the old URL even not recognize
  -that the pages was renamed.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -We rewrite the old URL to the new one internally via the following rule:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h1>Content Handling</h1>
  +
  +      <h2>From Old to New (intern)</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>Assume we have recently renamed the page
  +        <code>foo.html</code> to <code>bar.html</code> and now want
  +        to provide the old URL for backward compatibility. Actually
  +        we do not even want users of the old URL to notice that
  +        the page was renamed.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          We rewrite the old URL to the new one internally via the
  +          following rule: 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine  on
   RewriteBase    /~quux/
  -RewriteRule    ^<STRONG>foo</STRONG>\.html$  <STRONG>bar</STRONG>.html
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>From Old to New (extern)</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -Assume again that we have recently renamed the page <CODE>bar.html</CODE> to
  -<CODE>foo.html</CODE> and now want to provide the old URL for backward
  -compatibility. But this time we want that the users of the old URL get hinted
  -to the new one, i.e. their browsers Location field should change, too.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -We force a HTTP redirect to the new URL which leads to a change of the
  -browsers and thus the users view:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteRule    ^<strong>foo</strong>\.html$  <strong>bar</strong>.html
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>From Old to New (extern)</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>Assume again that we have recently renamed the page
  +        <code>foo.html</code> to <code>bar.html</code> and now want
  +        to provide the old URL for backward compatibility. But this
  +        time we want the users of the old URL to be pointed to
  +        the new one, i.e. their browser's Location field should
  +        change, too.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          We force an HTTP redirect to the new URL, which changes
  +          the browser's and thus the user's view: 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine  on
   RewriteBase    /~quux/
  -RewriteRule    ^<STRONG>foo</STRONG>\.html$  <STRONG>bar</STRONG>.html  [<STRONG>R</STRONG>]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Browser Dependend Content</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -At least for important top-level pages it is sometimes necesarry to provide
  -the optimum of browser dependend content, i.e. one has to provide a maximum
  -version for the latest Netscape variants, a minimum version for the Lynx
  -browsers and a average feature version for all others.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -We cannot use content negotiation because the browsers do not provide their
  -type in that form. Instead we have to act on the HTTP header "User-Agent".
  -The following condig does the following: If the HTTP header "User-Agent"
  -begins with "Mozilla/3", the page <CODE>foo.html</CODE> is rewritten to
  -<CODE>foo.NS.html</CODE> and and the rewriting stops.  If the browser is "Lynx" or
  -"Mozilla" of version 1 or 2 the URL becomes <CODE>foo.20.html</CODE>.  All other
  -browsers receive page <CODE>foo.32.html</CODE>. This is done by the following
  -ruleset:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  -RewriteCond %{HTTP_USER_AGENT}  ^<STRONG>Mozilla/3</STRONG>.*
  -RewriteRule ^foo\.html$         foo.<STRONG>NS</STRONG>.html          [<STRONG>L</STRONG>]
  -
  -RewriteCond %{HTTP_USER_AGENT}  ^<STRONG>Lynx/</STRONG>.*         [OR]
  -RewriteCond %{HTTP_USER_AGENT}  ^<STRONG>Mozilla/[12]</STRONG>.*
  -RewriteRule ^foo\.html$         foo.<STRONG>20</STRONG>.html          [<STRONG>L</STRONG>]
  -
  -RewriteRule ^foo\.html$         foo.<STRONG>32</STRONG>.html          [<STRONG>L</STRONG>]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Dynamic Mirror</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -Assume there are nice webpages on remote hosts we want to bring into our
  -namespace. For FTP servers we would use the <CODE>mirror</CODE> program which
  -actually maintains an explicit up-to-date copy of the remote data on the local
  -machine. For a webserver we could use the program <CODE>webcopy</CODE> which acts
  -similar via HTTP. But both techniques have one major drawback: The local copy
  -is always just as up-to-date as often we run the program. It would be much
  -better if the mirror is not a static one we have to establish explicitly.
  -Instead we want a dynamic mirror with data which gets updated automatically
  -when there is need (updated data on the remote host).
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -To provide this feature we map the remote webpage or even the complete remote
  -webarea to our namespace by the use of the <I>Proxy Throughput</I> feature
  -(flag [P]):
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteRule    ^<strong>foo</strong>\.html$  <strong>bar</strong>.html  [<strong>R</strong>]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Browser Dependent Content</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>At least for important top-level pages it is sometimes
  +        necessary to provide the optimum of browser dependent
  +        content, i.e. one has to provide a maximum version for the
  +        latest Netscape variants, a minimum version for the Lynx
  +        browsers and an average feature version for all others.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          We cannot use content negotiation because the browsers do
  +          not provide their type in that form. Instead we have to
  +          act on the HTTP header "User-Agent". The config below
  +          works as follows: If the HTTP header "User-Agent"
  +          begins with "Mozilla/3", the page <code>foo.html</code>
  +          is rewritten to <code>foo.NS.html</code> and the
  +          rewriting stops. If the browser is "Lynx" or "Mozilla" of
  +          version 1 or 2 the URL becomes <code>foo.20.html</code>.
  +          All other browsers receive page <code>foo.32.html</code>.
  +          This is done by the following ruleset: 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
  +RewriteCond %{HTTP_USER_AGENT}  ^<strong>Mozilla/3</strong>.*
  +RewriteRule ^foo\.html$         foo.<strong>NS</strong>.html          [<strong>L</strong>]
  +
  +RewriteCond %{HTTP_USER_AGENT}  ^<strong>Lynx/</strong>.*         [OR]
  +RewriteCond %{HTTP_USER_AGENT}  ^<strong>Mozilla/[12]</strong>.*
  +RewriteRule ^foo\.html$         foo.<strong>20</strong>.html          [<strong>L</strong>]
  +
  +RewriteRule ^foo\.html$         foo.<strong>32</strong>.html          [<strong>L</strong>]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Dynamic Mirror</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>Assume there are nice webpages on remote hosts we want
  +        to bring into our namespace. For FTP servers we would use
  +        the <code>mirror</code> program which actually maintains an
  +        explicit up-to-date copy of the remote data on the local
  +        machine. For a webserver we could use the program
  +        <code>webcopy</code> which acts similarly via HTTP. But both
  +        techniques have one major drawback: The local copy is
  +        only as up-to-date as the last run of the program. It
  +        would be much better if the mirror were not a static one we
  +        have to establish explicitly. Instead we want a dynamic
  +        mirror with data which gets updated automatically when
  +        there is need (updated data on the remote host).</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          To provide this feature we map the remote webpage or even
  +          the complete remote webarea to our namespace by the use
  +          of the <i>Proxy Throughput</i> feature (flag [P]): 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine  on
   RewriteBase    /~quux/
  -RewriteRule    ^<STRONG>hotsheet/</STRONG>(.*)$  <STRONG>http://www.tstimpreso.com/hotsheet/</STRONG>$1  [<STRONG>P</STRONG>]
  -</PRE></TD></TR></TABLE>
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteRule    ^<strong>hotsheet/</strong>(.*)$  <strong>http://www.tstimpreso.com/hotsheet/</strong>$1  [<strong>P</strong>]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine  on
   RewriteBase    /~quux/
  -RewriteRule    ^<STRONG>usa-news\.html</STRONG>$   <STRONG>http://www.quux-corp.com/news/index.html</STRONG>  [<STRONG>P</STRONG>]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Reverse Dynamic Mirror</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -...
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteRule    ^<strong>usa-news\.html</strong>$   <strong>http://www.quux-corp.com/news/index.html</strong>  [<strong>P</strong>]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Reverse Dynamic Mirror</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>...</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine on
   RewriteCond   /mirror/of/remotesite/$1           -U 
   RewriteRule   ^http://www\.remotesite\.com/(.*)$ /mirror/of/remotesite/$1
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Retrieve Missing Data from Intranet</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -This is a tricky way of virtually running a corporate (external) Internet
  -webserver (<CODE>www.quux-corp.dom</CODE>), while actually keeping and maintaining
  -its data on an (internal) Intranet webserver
  -(<CODE>www2.quux-corp.dom</CODE>) which is protected by a firewall.  The
  -trick is that on the external webserver we retrieve the requested data
  -on-the-fly from the internal one.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -First, we have to make sure that our firewall still protects the internal
  -webserver and that only the external webserver is allowed to retrieve data
  -from it. For a packet-filtering firewall we could for instance configure a
  -firewall ruleset like the following:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  -<STRONG>ALLOW</STRONG> Host www.quux-corp.dom Port &gt;1024 --&gt; Host www2.quux-corp.dom Port <STRONG>80</STRONG>  
  -<STRONG>DENY</STRONG>  Host *                 Port *     --&gt; Host www2.quux-corp.dom Port <STRONG>80</STRONG>
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -Just adjust it to your actual configuration syntax. Now we can establish the
  -mod_rewrite rules which request the missing data in the background through the
  -proxy throughput feature:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Retrieve Missing Data from Intranet</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>This is a tricky way of virtually running a corporate
  +        (external) Internet webserver
  +        (<code>www.quux-corp.dom</code>), while actually keeping
  +        and maintaining its data on an (internal) Intranet webserver
  +        (<code>www2.quux-corp.dom</code>) which is protected by a
  +        firewall. The trick is that on the external webserver we
  +        retrieve the requested data on-the-fly from the internal
  +        one.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          First, we have to make sure that our firewall still
  +          protects the internal webserver and that only the
  +          external webserver is allowed to retrieve data from it.
  +          For a packet-filtering firewall we could for instance
  +          configure a firewall ruleset like the following: 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
  +<strong>ALLOW</strong> Host www.quux-corp.dom Port &gt;1024 --&gt; Host www2.quux-corp.dom Port <strong>80</strong>  
  +<strong>DENY</strong>  Host *                 Port *     --&gt; Host www2.quux-corp.dom Port <strong>80</strong>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>Just adjust it to your actual configuration syntax.
  +          Now we can establish the mod_rewrite rules which request
  +          the missing data in the background through the proxy
  +          throughput feature:</p>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteRule ^/~([^/]+)/?(.*)          /home/$1/.www/$2
  -RewriteCond %{REQUEST_FILENAME}       <STRONG>!-f</STRONG>
  -RewriteCond %{REQUEST_FILENAME}       <STRONG>!-d</STRONG>
  -RewriteRule ^/home/([^/]+)/.www/?(.*) http://<STRONG>www2</STRONG>.quux-corp.dom/~$1/pub/$2 [<STRONG>P</STRONG>]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Load Balancing</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -Suppose we want to load balance the traffic to <CODE>www.foo.com</CODE> over
  -<CODE>www[0-5].foo.com</CODE> (a total of 6 servers). How can this be done?
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -There are a lot of possible solutions for this problem. We will discuss first
  -a commonly known DNS-based variant and then the special one with mod_rewrite:
  -
  -<ol>
  -<li><STRONG>DNS Round-Robin</STRONG>
  -
  -<P>
  -The simplest method for load-balancing is to use the DNS round-robin feature
  -of BIND. Here you just configure <CODE>www[0-9].foo.com</CODE> as usual in your
  -DNS with A (address) records, e.g.
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteCond %{REQUEST_FILENAME}       <strong>!-f</strong>
  +RewriteCond %{REQUEST_FILENAME}       <strong>!-d</strong>
  +RewriteRule ^/home/([^/]+)/.www/?(.*) http://<strong>www2</strong>.quux-corp.dom/~$1/pub/$2 [<strong>P</strong>]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Load Balancing</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>Suppose we want to load balance the traffic to
  +        <code>www.foo.com</code> over <code>www[0-5].foo.com</code>
  +        (a total of 6 servers). How can this be done?</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          There are a lot of possible solutions for this problem.
  +          We will discuss first a commonly known DNS-based variant
  +          and then the special one with mod_rewrite: 
  +
  +          <ol>
  +            <li>
  +              <strong>DNS Round-Robin</strong> 
  +
  +              <p>The simplest method for load-balancing is to use
  +              the DNS round-robin feature of BIND. Here you just
  +              configure <code>www[0-9].foo.com</code> as usual in
  +              your DNS with A (address) records, e.g.</p>
  +
  +              <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +              cellpadding="5">
  +                <tr>
  +                  <td>
  +<pre>
   www0   IN  A       1.2.3.1
   www1   IN  A       1.2.3.2
   www2   IN  A       1.2.3.3
   www3   IN  A       1.2.3.4
   www4   IN  A       1.2.3.5
   www5   IN  A       1.2.3.6
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -Then you additionally add the following entry:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +                  </td>
  +                </tr>
  +              </table>
  +
  +              <p>Then you additionally add the following entry:</p>
  +
  +              <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +              cellpadding="5">
  +                <tr>
  +                  <td>
  +<pre>
   www    IN  CNAME   www0.foo.com.
          IN  CNAME   www1.foo.com.
          IN  CNAME   www2.foo.com.
  @@ -1063,60 +1326,89 @@
          IN  CNAME   www4.foo.com.
          IN  CNAME   www5.foo.com.
          IN  CNAME   www6.foo.com.
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -Notice that this seems wrong, but is actually an intended feature of BIND and
  -can be used in this way. However, now when <CODE>www.foo.com</CODE> gets resolved,
  -BIND gives out <CODE>www0-www6</CODE> - but in a slightly permuted/rotated order
  -every time.  This way the clients are spread over the various servers.
  -
  -But notice that this is not a perfect load balancing scheme, because DNS resolution
  -information gets cached by the other nameservers on the net, so once a client
  -has resolved <CODE>www.foo.com</CODE> to a particular <CODE>wwwN.foo.com</CODE>, all
  -subsequent requests also go to this particular name <CODE>wwwN.foo.com</CODE>. But
  -the final result is ok, because the total sum of the requests is really
  -spread over the various webservers.
  -
  -<P>
  -<li><STRONG>DNS Load-Balancing</STRONG>
  -
  -<P>
  -A sophisticated DNS-based method for load-balancing is to use the program
  -<CODE>lbnamed</CODE> which can be found at <A
  -HREF="http://www.stanford.edu/~schemers/docs/lbnamed/lbnamed.html">http://www.stanford.edu/~schemers/docs/lbnamed/lbnamed.html</A>.
  -It is a Perl 5 program in conjunction with auxiliary tools which provides
  -real load balancing for DNS.
  -
  -<P>
  -<li><STRONG>Proxy Throughput Round-Robin</STRONG>
  -
  -<P>
  -In this variant we use mod_rewrite and its proxy throughput feature.  First we
  -dedicate <CODE>www0.foo.com</CODE> to be actually <CODE>www.foo.com</CODE> by using a
  -single
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +                  </td>
  +                </tr>
  +              </table>
  +
  +              <p>Notice that this seems wrong, but is actually an
  +              intended feature of BIND and can be used in this way.
  +              However, now when <code>www.foo.com</code> gets
  +              resolved, BIND gives out <code>www0-www6</code> - but
  +              in a slightly permuted/rotated order every time.
  +              This way the clients are spread over the various
  +              servers. But notice that this is not a perfect load
  +              balancing scheme, because DNS resolution information
  +              gets cached by the other nameservers on the net, so
  +              once a client has resolved <code>www.foo.com</code>
  +              to a particular <code>wwwN.foo.com</code>, all
  +              subsequent requests also go to this particular name
  +              <code>wwwN.foo.com</code>. But the final result is
  +              ok, because the total sum of the requests is really
  +              spread over the various webservers.</p>
  +            </li>
  +
  +            <li>
  +              <strong>DNS Load-Balancing</strong> 
  +
  +              <p>A sophisticated DNS-based method for
  +              load-balancing is to use the program
  +              <code>lbnamed</code> which can be found at <a
  +              href="http://www.stanford.edu/~schemers/docs/lbnamed/lbnamed.html">
  +              http://www.stanford.edu/~schemers/docs/lbnamed/lbnamed.html</a>.
  +              It is a Perl 5 program in conjunction with auxiliary
  +              tools which provides real load balancing for
  +              DNS.</p>
  +            </li>
  +
  +            <li>
  +              <strong>Proxy Throughput Round-Robin</strong> 
  +
  +              <p>In this variant we use mod_rewrite and its proxy
  +              throughput feature. First we dedicate
  +              <code>www0.foo.com</code> to be actually
  +              <code>www.foo.com</code> by using a single</p>
  +
  +              <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +              cellpadding="5">
  +                <tr>
  +                  <td>
  +<pre>
   www    IN  CNAME   www0.foo.com.
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -entry in the DNS. Then we convert <CODE>www0.foo.com</CODE> to a proxy-only
  -server, i.e. we configure this machine so all arriving URLs are just pushed
  -through the internal proxy to one of the 5 other servers (<CODE>www1-www5</CODE>).
  -To accomplish this we first establish a ruleset which contacts a load
  -balancing script <CODE>lb.pl</CODE> for all URLs.
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +                  </td>
  +                </tr>
  +              </table>
  +
  +              <p>entry in the DNS. Then we convert
  +              <code>www0.foo.com</code> to a proxy-only server,
  +              i.e. we configure this machine so all arriving URLs
  +              are just pushed through the internal proxy to one of
  +              the 5 other servers (<code>www1-www5</code>). To
  +              accomplish this we first establish a ruleset which
  +              contacts a load balancing script <code>lb.pl</code>
  +              for all URLs.</p>
  +
  +              <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +              cellpadding="5">
  +                <tr>
  +                  <td>
  +<pre>
   RewriteEngine on
   RewriteMap    lb      prg:/path/to/lb.pl
   RewriteRule   ^/(.+)$ ${lb:$1}           [P,L]
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -Then we write <CODE>lb.pl</CODE>:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +                  </td>
  +                </tr>
  +              </table>
  +
  +              <p>Then we write <code>lb.pl</code>:</p>
  +
  +              <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +              cellpadding="5">
  +                <tr>
  +                  <td>
  +<pre>
   #!/path/to/perl
   ##
   ##  lb.pl -- load balancing script
  @@ -1137,41 +1429,48 @@
   }
   
   ##EOF##
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -A final note: why is this useful? It seems that <CODE>www0.foo.com</CODE> is
  -still overloaded. The answer is yes, it is overloaded, but only with plain proxy
  -throughput requests! All SSI, CGI, ePerl, etc. processing is completely
  -done on the other machines. This is the essential point.
  -
  -<P>
  -<li><STRONG>Hardware/TCP Round-Robin</STRONG>
  -
  -<P>
  -There is a hardware solution available, too. Cisco has a beast called
  -LocalDirector which does load balancing at the TCP/IP level. Actually this
  -is some sort of a circuit level gateway in front of a webcluster.  If you have
  -enough money and really need a solution with high performance, use this one.
  -
  -</ol>
  -
  -</DL>
  -
  -<P>
  -<H2>Reverse Proxy</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -...
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +                  </td>
  +                </tr>
  +              </table>
  +
  +              <p>A final note: why is this useful? It seems that
  +              <code>www0.foo.com</code> is still overloaded. The
  +              answer is yes, it is overloaded, but only with plain
  +              proxy throughput requests! All SSI, CGI, ePerl, etc.
  +              processing is completely done on the other machines.
  +              This is the essential point.</p>
  +            </li>
  +
  +            <li>
  +              <strong>Hardware/TCP Round-Robin</strong> 
  +
  +              <p>There is a hardware solution available, too. Cisco
  +              has a beast called LocalDirector which does load
  +              balancing at the TCP/IP level. Actually this is some
  +              sort of a circuit level gateway in front of a
  +              webcluster. If you have enough money and really need
  +              a solution with high performance, use this one.</p>
  +            </li>
  +          </ol>
  +        </dd>
  +      </dl>
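The proxy throughput variant relies on a program map: mod_rewrite starts the program once, writes one lookup key per line to its standard input, and reads one answer line back from its standard output. The following is a minimal sketch of such a program in Python. It is not the lb.pl script of the guide; the function names make_picker and serve, and the default backend names, are assumptions chosen to match the www1-www5 example.

```python
import sys
from itertools import cycle


def make_picker(name="www", first=1, last=5, domain="foo.com"):
    """Return a function that maps a URL path to the next backend URL.

    The backends name<first>..name<last> are handed out in strict
    round-robin order. All defaults are illustrative assumptions.
    """
    backends = cycle(f"{name}{n}.{domain}" for n in range(first, last + 1))

    def pick(path):
        return f"http://{next(backends)}/{path}"

    return pick


def serve(stdin=sys.stdin, stdout=sys.stdout):
    """Answer one lookup per input line, as a prg: map program must."""
    pick = make_picker()
    for line in stdin:
        stdout.write(pick(line.rstrip("\n")) + "\n")
        # The server waits for each answer, so output must not be buffered.
        stdout.flush()

# To use this file as a map program, call serve() at the end of the script.
```

A strict rotation spreads requests evenly but ignores the actual load of each backend, which is the same limitation the DNS round-robin variant has.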
  +
  +      <h2>Reverse Proxy</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>...</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   ##
   ##  apache-rproxy.conf -- Apache configuration for Reverse Proxy Usage
   ##
  @@ -1256,9 +1555,16 @@
   ProxyPassReverse  /  http://www4.foo.dom/
   ProxyPassReverse  /  http://www5.foo.dom/
   ProxyPassReverse  /  http://www6.foo.dom/
  -</PRE></TD></TR></TABLE>
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   ##
   ##  apache-rproxy.conf-servers -- Apache/mod_rewrite selection table
   ##
  @@ -1270,182 +1576,227 @@
   #   list of backend servers which serve dynamically 
   #   generated page (CGI programs or mod_perl scripts)
   dynamic   www5.foo.dom|www6.foo.dom
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>New MIME-type, New Service</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -On the net there are a lot of nifty CGI programs. But their usage is usually
  -boring, so a lot of webmasters don't use them.  Even Apache's Action handler
  -feature for MIME-types is only appropriate when the CGI programs don't need
  -special URLs (actually PATH_INFO and QUERY_STRINGS) as their input. 
  -
  -First, let us configure a new file type with extension <CODE>.scgi</CODE>
  -(for secure CGI) which will be processed by the popular <CODE>cgiwrap</CODE>
  -program. The problem here is that if, for instance, we use a Homogeneous URL Layout
  -(see above), a file inside the user homedirs has the URL
  -<CODE>/u/user/foo/bar.scgi</CODE>. But <CODE>cgiwrap</CODE> needs the URL in the form
  -<CODE>/~user/foo/bar.scgi/</CODE>. The following rule solves the problem:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  -RewriteRule ^/[uge]/<STRONG>([^/]+)</STRONG>/\.www/(.+)\.scgi(.*) ...
  -... /internal/cgi/user/cgiwrap/~<STRONG>$1</STRONG>/$2.scgi$3  [NS,<STRONG>T=application/x-httpd-cgi</STRONG>]
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -Or assume we have some more nifty programs:
  -<CODE>wwwlog</CODE> (which displays the <CODE>access.log</CODE> for a URL subtree) and
  -<CODE>wwwidx</CODE> (which runs Glimpse on a URL subtree). We have to
  -provide the URL area to these programs so they know which area
  -they have to act on. But usually this is ugly, because they are always
  -requested from within those areas, i.e. typically we would run
  -the <CODE>wwwidx</CODE> program from within <CODE>/u/user/foo/</CODE> via
  -hyperlink to
  -
  -<P><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>New MIME-type, New Service</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>
  +          On the net there are a lot of nifty CGI programs. But
  +          their usage is usually boring, so a lot of webmasters
  +          don't use them. Even Apache's Action handler feature for
  +          MIME-types is only appropriate when the CGI programs
  +          don't need special URLs (actually PATH_INFO and
  +          QUERY_STRINGS) as their input. First, let us configure a
  +          new file type with extension <code>.scgi</code> (for
  +          secure CGI) which will be processed by the popular
  +          <code>cgiwrap</code> program. The problem here is that
  +          if, for instance, we use a Homogeneous URL Layout (see
  +          above), a file inside the user homedirs has the URL
  +          <code>/u/user/foo/bar.scgi</code>. But
  +          <code>cgiwrap</code> needs the URL in the form
  +          <code>/~user/foo/bar.scgi/</code>. The following rule
  +          solves the problem: 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
  +RewriteRule ^/[uge]/<strong>([^/]+)</strong>/\.www/(.+)\.scgi(.*) ...
  +... /internal/cgi/user/cgiwrap/~<strong>$1</strong>/$2.scgi$3  [NS,<strong>T=application/x-httpd-cgi</strong>]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>Or assume we have some more nifty programs:
  +          <code>wwwlog</code> (which displays the
  +          <code>access.log</code> for a URL subtree) and
  +          <code>wwwidx</code> (which runs Glimpse on a URL
  +          subtree). We have to provide the URL area to these
  +          programs so they know which area they have to act on.
  +          But usually this is ugly, because they are always
  +          requested from within those areas, i.e. typically we would
  +          run the <code>wwwidx</code> program from within
  +          <code>/u/user/foo/</code> via hyperlink to</p>
  +<pre>
   /internal/cgi/user/wwwidx?i=/u/user/foo/
  -</PRE><P>
  +</pre>
   
  -which is ugly, because we have to hard-code <STRONG>both</STRONG> the location of the
  -area <STRONG>and</STRONG> the location of the CGI inside the hyperlink. When we have to
  -reorganise the area, we spend a lot of time changing the various hyperlinks.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -The solution here is to provide a special new URL format which automatically
  -leads to the proper CGI invocation. We configure the following:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +          <p>which is ugly, because we have to hard-code
  +          <strong>both</strong> the location of the area
  +          <strong>and</strong> the location of the CGI inside the
  +          hyperlink. When we have to reorganise the area, we spend a
  +          lot of time changing the various hyperlinks.</p>
  +        </dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          The solution here is to provide a special new URL format
  +          which automatically leads to the proper CGI invocation.
  +          We configure the following: 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteRule   ^/([uge])/([^/]+)(/?.*)/\*  /internal/cgi/user/wwwidx?i=/$1/$2$3/
   RewriteRule   ^/([uge])/([^/]+)(/?.*):log /internal/cgi/user/wwwlog?f=/$1/$2$3
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -Now the hyperlink to search at <CODE>/u/user/foo/</CODE> reads only
  -
  -<P><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>Now the hyperlink to search at
  +          <code>/u/user/foo/</code> reads only</p>
  +<pre>
   HREF="*"
  -</PRE><P>
  -
  -which internally gets automatically transformed to 
  +</pre>
   
  -<P><PRE>
  +          <p>which internally gets automatically transformed to</p>
  +<pre>
   /internal/cgi/user/wwwidx?i=/u/user/foo/
  -</PRE><P>
  -
  -The same approach leads to an invocation for the access log CGI
  -program when the hyperlink <CODE>:log</CODE> gets used.
  -
  -</DL>
  -
  -<P>
  -<H2>From Static to Dynamic</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -How can we transform a static page <CODE>foo.html</CODE> into a dynamic variant
  -<CODE>foo.cgi</CODE> in a seamless way, i.e.  without notice by the browser/user?
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -We just rewrite the URL to the CGI-script and force the correct MIME-type so
  -it gets really run as a CGI-script. This way a request to
  -<CODE>/~quux/foo.html</CODE> internally leads to the invocation of
  -<CODE>/~quux/foo.cgi</CODE>.
  +</pre>
   
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +          <p>The same approach leads to an invocation for the
  +          access log CGI program when the hyperlink
  +          <code>:log</code> gets used.</p>
  +        </dd>
  +      </dl>
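To see what the two rules for the new URL format do to a request, the following sketch replays them with Python's re module. It reuses the patterns and substitutions from the ruleset above, translated to Python backreference syntax. The RULES table and the rewrite function are illustrative names, and stopping at the first matching rule is a simplification of how mod_rewrite walks a ruleset.

```python
import re

# (pattern, substitution) pairs mirroring the two RewriteRule lines;
# $N backreferences are written as \N for Python.
RULES = [
    (re.compile(r"^/([uge])/([^/]+)(/?.*)/\*"),
     r"/internal/cgi/user/wwwidx?i=/\1/\2\3/"),
    (re.compile(r"^/([uge])/([^/]+)(/?.*):log"),
     r"/internal/cgi/user/wwwlog?f=/\1/\2\3"),
]


def rewrite(url):
    """Return the rewritten URL, or the URL unchanged if no rule matches."""
    for pattern, template in RULES:
        match = pattern.match(url)
        if match:
            # As in mod_rewrite, the substitution replaces the whole URL.
            return match.expand(template)
    return url


print(rewrite("/u/user/foo/*"))
print(rewrite("/u/user/foo/:log"))
```

A relative hyperlink "*" on a page under /u/user/foo/ resolves to /u/user/foo/*, which is the first input above.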
  +
  +      <h2>From Static to Dynamic</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>How can we transform a static page
  +        <code>foo.html</code> into a dynamic variant
  +        <code>foo.cgi</code> in a seemless way, i.e. without notice
  +        by the browser/user.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          We just rewrite the URL to the CGI-script and force the
  +          correct MIME-type so it gets really run as a CGI-script.
  +          This way a request to <code>/~quux/foo.html</code>
  +          internally leads to the invocation of
  +          <code>/~quux/foo.cgi</code>. 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine  on
   RewriteBase    /~quux/
  -RewriteRule    ^foo\.<STRONG>html</STRONG>$  foo.<STRONG>cgi</STRONG>  [T=<STRONG>application/x-httpd-cgi</STRONG>]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>On-the-fly Content-Regeneration</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -Here comes a really esoteric feature: Dynamically generated but statically
  -served pages, i.e. pages should be delivered as pure static pages (read from
  -the filesystem and just passed through), but they have to be generated
  -dynamically by the webserver if missing. This way you can have CGI-generated
  -pages which are statically served unless one (or a cronjob) removes the static
  -contents. Then the contents get refreshed.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -This is done via the following ruleset:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  -RewriteCond %{REQUEST_FILENAME}   <STRONG>!-s</STRONG>
  -RewriteRule ^page\.<STRONG>html</STRONG>$          page.<STRONG>cgi</STRONG>   [T=application/x-httpd-cgi,L]
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -Here a request to <CODE>page.html</CODE> leads to an internal run of a
  -corresponding <CODE>page.cgi</CODE> if <CODE>page.html</CODE> is still missing or has
  -a file size of zero. The trick here is that <CODE>page.cgi</CODE> is a usual CGI script
  -which (in addition to its STDOUT) writes its output to the file
  -<CODE>page.html</CODE>. Once it has run, the server sends out the data of
  -<CODE>page.html</CODE>. When the webmaster wants to force a refresh of the contents,
  -he just removes <CODE>page.html</CODE> (usually done by a cronjob).
  -
  -</DL>
  -
  -<P>
  -<H2>Document With Autorefresh</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -Wouldn't it be nice while creating a complex webpage if the webbrowser would
  -automatically refresh the page every time we write a new version from within
  -our editor? Impossible?
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -No! We just combine the MIME multipart feature, the webserver NPH feature and
  -the URL manipulation power of mod_rewrite. First, we establish a new URL
  -feature: Adding just <CODE>:refresh</CODE> to any URL causes this to be refreshed
  -every time it gets updated on the filesystem.
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteRule    ^foo\.<strong>html</strong>$  foo.<strong>cgi</strong>  [T=<strong>application/x-httpd-cgi</strong>]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>On-the-fly Content-Regeneration</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>Here comes a really esoteric feature: Dynamically
  +        generated but statically served pages, i.e. pages should be
  +        delivered as pure static pages (read from the filesystem
  +        and just passed through), but they have to be generated
  +        dynamically by the webserver if missing. This way you can
  +        have CGI-generated pages which are statically served unless
  +        one (or a cronjob) removes the static contents. Then the
  +        contents get refreshed.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          This is done via the following ruleset: 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
  +RewriteCond %{REQUEST_FILENAME}   <strong>!-s</strong>
  +RewriteRule ^page\.<strong>html</strong>$          page.<strong>cgi</strong>   [T=application/x-httpd-cgi,L]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>Here a request to <code>page.html</code> leads to an
  +          internal run of a corresponding <code>page.cgi</code> if
  +          <code>page.html</code> is still missing or has a file size
  +          of zero. The trick here is that <code>page.cgi</code> is a
  +          usual CGI script which (in addition to its STDOUT)
  +          writes its output to the file <code>page.html</code>.
  +          Once it has run, the server sends out the data of
  +          <code>page.html</code>. When the webmaster wants to force
  +          a refresh of the contents, he just removes
  +          <code>page.html</code> (usually done by a cronjob).</p>
  +        </dd>
  +      </dl>
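The regeneration trick depends on page.cgi doing two things in one run: answering the current request and leaving page.html behind for the next one. The following is a minimal sketch of that step in Python, under stated assumptions: the function name regenerate and the temporary-file suffix are hypothetical, and the page body is taken as an already generated string.

```python
import os
import sys


def regenerate(body, cache_path, out=sys.stdout):
    """Store body as the static page, then send it as the CGI response."""
    tmp_path = cache_path + ".tmp"  # hypothetical scratch name
    with open(tmp_path, "w", encoding="utf-8") as handle:
        handle.write(body)
    # Rename into place in one step, so a concurrent request never finds
    # a non-empty but half-written page.html.
    os.replace(tmp_path, cache_path)

    out.write("Content-Type: text/html\n\n")
    out.write(body)
    out.flush()
```

Writing to a temporary file first matters because the condition only tests that the file exists with a size greater than zero. A partially written page would pass that test and be served as-is.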
  +
  +      <h2>Document With Autorefresh</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>Wouldn't it be nice while creating a complex webpage if
  +        the webbrowser would automatically refresh the page every
  +        time we write a new version from within our editor?
  +        Impossible?</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          No! We just combine the MIME multipart feature, the
  +          webserver NPH feature and the URL manipulation power of
  +          mod_rewrite. First, we establish a new URL feature:
  +          Adding just <code>:refresh</code> to any URL causes this
  +          to be refreshed every time it gets updated on the
  +          filesystem. 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteRule   ^(/[uge]/[^/]+/?.*):refresh  /internal/cgi/apache/nph-refresh?f=$1
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -Now when we reference the URL
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
   
  -<P><PRE>
  +          <p>Now when we reference the URL</p>
  +<pre>
   /u/foo/bar/page.html:refresh
  -</PRE><P>
  +</pre>
   
  -this leads to the internal invocation of the URL
  -
  -<P><PRE>
  +          <p>this leads to the internal invocation of the URL</p>
  +<pre>
   /internal/cgi/apache/nph-refresh?f=/u/foo/bar/page.html
  -</PRE><P>
  -
  -The only missing part is the NPH-CGI script. Although one would usually say
  -"left as an exercise to the reader" ;-) I will provide this, too.
  +</pre>
   
  -<P><PRE>
  +          <p>The only missing part is the NPH-CGI script. Although
  +          one would usually say "left as an exercise to the reader"
  +          ;-) I will provide this, too.</p>
  +<pre>
   #!/sw/bin/perl
   ##
   ##  nph-refresh -- NPH/CGI script for auto refreshing pages
  @@ -1547,29 +1898,33 @@
   exit(0);
   
   ##EOF##
  -</PRE>
  -
  -</DL>
  -
  -<P>
  -<H2>Mass Virtual Hosting</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -The <CODE>&lt;VirtualHost&gt;</CODE> feature of Apache is nice and works great
  -when you just have a few dozen virtual hosts. But when you are an ISP and
  -have hundreds of virtual hosts to provide, this feature is not the best choice.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -To provide this feature we map the remote webpage or even the complete remote
  -webarea to our namespace by the use of the <I>Proxy Throughput</I> feature
  -(flag [P]):
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +        </dd>
  +      </dl>
  +
  +      <h2>Mass Virtual Hosting</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>The <code>&lt;VirtualHost&gt;</code> feature of Apache
  +        is nice and works great when you just have a few dozen
  +        virtual hosts. But when you are an ISP and have hundreds of
  +        virtual hosts to provide, this feature is not the best
  +        choice.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          To provide this feature we map the remote webpage or even
  +          the complete remote webarea to our namespace by the use
  +          of the <i>Proxy Throughput</i> feature (flag [P]): 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   ##
   ##  vhost.map 
   ## 
  @@ -1577,9 +1932,16 @@
   www.vhost2.dom:80  /path/to/docroot/vhost2
        :
   www.vhostN.dom:80  /path/to/docroot/vhostN
  -</PRE></TD></TR></TABLE>
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   ##
   ##  httpd.conf
   ##
  @@ -1627,101 +1989,135 @@
   #      and remember the virtual host for logging puposes
   RewriteRule   ^/(.*)$   %1/$1  [E=VHOST:${lowercase:%{HTTP_HOST}}]
       : 
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<H1>Access Restriction</H1>
  -
  -<P>
  -<H2>Blocking of Robots</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -How can we block a really annoying robot from retrieving pages of a specific
  -webarea? A <CODE>/robots.txt</CODE> file containing entries of the "Robot
  -Exclusion Protocol" is typically not enough to get rid of such a robot.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -We use a ruleset which forbids the URLs of the webarea
  -<CODE>/~quux/foo/arc/</CODE> (perhaps a very deep directory indexed area where the
  -robot traversal would create big server load).   We have to make sure that we
  -forbid access only to the particular robot, i.e. just forbidding the host
  -where the robot runs is not enough. This would block users from this host,
  -too. We accomplish this by also matching the User-Agent HTTP header
  -information.
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  -RewriteCond %{HTTP_USER_AGENT}   ^<STRONG>NameOfBadRobot</STRONG>.*      
  -RewriteCond %{REMOTE_ADDR}       ^<STRONG>123\.45\.67\.[8-9]</STRONG>$
  -RewriteRule ^<STRONG>/~quux/foo/arc/</STRONG>.+   -   [<STRONG>F</STRONG>]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Blocked Inline-Images</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -Assume we have under http://www.quux-corp.de/~quux/ some pages with inlined
  -GIF graphics. These graphics are nice, so others directly incorporate them via
  -hyperlinks to their pages. We don't like this practice because it adds useless
  -traffic to our server.
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -While we cannot 100% protect the images from inclusion, we
  -can at least restrict the cases where the browser sends
  -a HTTP Referer header.
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  -RewriteCond %{HTTP_REFERER} <STRONG>!^$</STRONG>                                  
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h1>Access Restriction</h1>
  +
  +      <h2>Blocking of Robots</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>How can we block a really annoying robot from
  +        retrieving pages of a specific webarea? A
  +        <code>/robots.txt</code> file containing entries of the
  +        "Robot Exclusion Protocol" is typically not enough to get
  +        rid of such a robot.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          We use a ruleset which forbids the URLs of the webarea
  +          <code>/~quux/foo/arc/</code> (perhaps a very deep
  +          directory indexed area where the robot traversal would
  +          create big server load). We have to make sure that we
  +          forbid access only to the particular robot, i.e. just
  +          forbidding the host where the robot runs is not enough.
  +          This would block users from this host, too. We accomplish
  +          this by also matching the User-Agent HTTP header
  +          information. 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
  +RewriteCond %{HTTP_USER_AGENT}   ^<strong>NameOfBadRobot</strong>.*      
  +RewriteCond %{REMOTE_ADDR}       ^<strong>123\.45\.67\.[8-9]</strong>$
  +RewriteRule ^<strong>/~quux/foo/arc/</strong>.+   -   [<strong>F</strong>]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Blocked Inline-Images</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>Assume we have under http://www.quux-corp.de/~quux/
  +        some pages with inlined GIF graphics. These graphics are
  +        nice, so others directly incorporate them via hyperlinks to
  +        their pages. We don't like this practice because it adds
  +        useless traffic to our server.</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          While we cannot 100% protect the images from inclusion,
  +          we can at least restrict the cases where the browser
  +          sends an HTTP Referer header. 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
  +RewriteCond %{HTTP_REFERER} <strong>!^$</strong>                                  
   RewriteCond %{HTTP_REFERER} !^http://www.quux-corp.de/~quux/.*$ [NC]
  -RewriteRule <STRONG>.*\.gif$</STRONG>        -                                    [F]
  -</PRE></TD></TR></TABLE>
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteRule <strong>.*\.gif$</strong>        -                                    [F]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteCond %{HTTP_REFERER}         !^$                                  
   RewriteCond %{HTTP_REFERER}         !.*/foo-with-gif\.html$
  -RewriteRule <STRONG>^inlined-in-foo\.gif$</STRONG>   -                        [F]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Host Deny</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -How can we forbid a list of externally configured hosts from using our server?
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -
  -For Apache &gt;= 1.3b6:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteRule <strong>^inlined-in-foo\.gif$</strong>   -                        [F]
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Host Deny</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>How can we forbid a list of externally configured hosts
  +        from using our server?</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          For Apache &gt;= 1.3b6: 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine on
   RewriteMap    hosts-deny  txt:/path/to/hosts.deny
   RewriteCond   ${hosts-deny:%{REMOTE_HOST}|NOT-FOUND} !=NOT-FOUND [OR]
   RewriteCond   ${hosts-deny:%{REMOTE_ADDR}|NOT-FOUND} !=NOT-FOUND
   RewriteRule   ^/.*  -  [F]
  -</PRE></TD></TR></TABLE><P>
  -
  -For Apache &lt;= 1.3b6:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>For Apache &lt;= 1.3b6:</p>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine on
   RewriteMap    hosts-deny  txt:/path/to/hosts.deny
   RewriteRule   ^/(.*)$ ${hosts-deny:%{REMOTE_HOST}|NOT-FOUND}/$1
  @@ -1729,9 +2125,16 @@
   RewriteRule   ^NOT-FOUND/(.*)$ ${hosts-deny:%{REMOTE_ADDR}|NOT-FOUND}/$1 
   RewriteRule   !^NOT-FOUND/.* - [F]
   RewriteRule   ^NOT-FOUND/(.*)$ /$1
  -</PRE></TD></TR></TABLE>
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   ##
   ##  hosts.deny 
   ##
  @@ -1743,84 +2146,110 @@
   193.102.180.41 -
   bsdti1.sdm.de  -
   192.76.162.40  -
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Proxy Deny</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -How can we forbid a certain host or even a user of a special host from using
  -the Apache proxy?
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -We first have to make sure mod_rewrite is below(!) mod_proxy in the
  -<CODE>Configuration</CODE> file when compiling the Apache webserver.  This way it
  -gets called _before_ mod_proxy. Then we configure the following for a
  -host-dependend deny...
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  -RewriteCond %{REMOTE_HOST} <STRONG>^badhost\.mydomain\.com$</STRONG> 
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Proxy Deny</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>How can we forbid a certain host or even a user of a
  +        special host from using the Apache proxy?</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          We first have to make sure mod_rewrite is below(!)
  +          mod_proxy in the <code>Configuration</code> file when
  +          compiling the Apache webserver. This way it gets called
  +          _before_ mod_proxy. Then we configure the following for a
  +          host-dependent deny... 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
  +RewriteCond %{REMOTE_HOST} <strong>^badhost\.mydomain\.com$</strong> 
   RewriteRule !^http://[^/.]\.mydomain.com.*  - [F]
  -</PRE></TD></TR></TABLE>
  -
  -<P>...and this one for a user@host-dependend deny:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  -RewriteCond %{REMOTE_IDENT}@%{REMOTE_HOST}  <STRONG>^badguy@badhost\.mydomain\.com$</STRONG>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>...and this one for a user@host-dependent deny:</p>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
  +RewriteCond %{REMOTE_IDENT}@%{REMOTE_HOST}  <strong>^badguy@badhost\.mydomain\.com$</strong>
   RewriteRule !^http://[^/.]\.mydomain.com.*  - [F]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Special Authentication Variant</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -Sometimes a very special authentication is needed, for instance a
  -authentication which checks for a set of explicitly configured users. Only
  -these should receive access and without explicit prompting (which would occur
  -when using the Basic Auth via mod_access).
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -We use a list of rewrite conditions to exclude all except our friends:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  -RewriteCond %{REMOTE_IDENT}@%{REMOTE_HOST} <STRONG>!^friend1@client1.quux-corp\.com$</STRONG> 
  -RewriteCond %{REMOTE_IDENT}@%{REMOTE_HOST} <STRONG>!^friend2</STRONG>@client2.quux-corp\.com$ 
  -RewriteCond %{REMOTE_IDENT}@%{REMOTE_HOST} <STRONG>!^friend3</STRONG>@client3.quux-corp\.com$ 
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Special Authentication Variant</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>Sometimes a very special authentication is needed, for
  +        instance an authentication which checks for a set of
  +        explicitly configured users. Only these should receive
  +        access and without explicit prompting (which would occur
  +        when using the Basic Auth via mod_auth).</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          We use a list of rewrite conditions to exclude all except
  +          our friends: 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
  +RewriteCond %{REMOTE_IDENT}@%{REMOTE_HOST} <strong>!^friend1@client1\.quux-corp\.com$</strong> 
  +RewriteCond %{REMOTE_IDENT}@%{REMOTE_HOST} <strong>!^friend2</strong>@client2\.quux-corp\.com$ 
  +RewriteCond %{REMOTE_IDENT}@%{REMOTE_HOST} <strong>!^friend3</strong>@client3\.quux-corp\.com$ 
   RewriteRule ^/~quux/only-for-friends/      -                                 [F]
  -</PRE></TD></TR></TABLE>
  -
  -</DL>
  -
  -<P>
  -<H2>Referer-based Deflector</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -How can we program a flexible URL Deflector which acts on the "Referer" HTTP
  -header and can be configured with as many referring pages as we like?
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -Use the following really tricky ruleset...
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +        </dd>
  +      </dl>
  +
  +      <h2>Referer-based Deflector</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>How can we program a flexible URL Deflector which acts
  +        on the "Referer" HTTP header and can be configured with as
  +        many referring pages as we like?</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          Use the following really tricky ruleset... 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteMap  deflector txt:/path/to/deflector.map
   
   RewriteCond %{HTTP_REFERER} !=""
  @@ -1830,12 +2259,19 @@
   RewriteCond %{HTTP_REFERER} !=""
   RewriteCond ${deflector:%{HTTP_REFERER}|NOT-FOUND} !=NOT-FOUND
   RewriteRule ^.* ${deflector:%{HTTP_REFERER}} [R,L]
  -</PRE></TD></TR></TABLE>
  -
  -<P>...
  -in conjunction with a corresponding rewrite map:
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>... in conjunction with a corresponding rewrite
  +          map:</p>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   ##
   ##  deflector.map
   ##
  @@ -1843,41 +2279,55 @@
   http://www.badguys.com/bad/index.html    -
   http://www.badguys.com/bad/index2.html   -
   http://www.badguys.com/bad/index3.html   http://somewhere.com/
  -</PRE></TD></TR></TABLE>
  -
  -<P>
  -This automatically redirects the request back to the referring page (when "-"
  -is used as the value in the map) or to a specific URL (when an URL is
  -specified in the map as the second argument).
  -
  -</DL>
  -
  -<H1>Other</H1>
  -
  -<P>
  -<H2>External Rewriting Engine</H2>
  -<P>
  -
  -<DL>
  -<DT><STRONG>Description:</STRONG>
  -<DD>
  -A FAQ: How can we solve the FOO/BAR/QUUX/etc. problem? There seems no solution
  -by the use of mod_rewrite...
  -
  -<P>
  -<DT><STRONG>Solution:</STRONG>
  -<DD>
  -Use an external rewrite map, i.e. a program which acts like a rewrite map.  It
  -is run once on startup of Apache receives the requested URLs on STDIN and has
  -to put the resulting (usually rewritten) URL on STDOUT (same order!).
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>This automatically redirects the request back to the
  +          referring page (when "-" is used as the value in the map)
  +          or to a specific URL (when a URL is specified in the map
  +          as the second argument).</p>
  +        </dd>
  +      </dl>
  +
  +      <h1>Other</h1>
  +
  +      <h2>External Rewriting Engine</h2>
  +
  +      <dl>
  +        <dt><strong>Description:</strong></dt>
  +
  +        <dd>A FAQ: How can we solve the FOO/BAR/QUUX/etc. problem?
  +        There seems to be no solution using mod_rewrite...</dd>
  +
  +        <dt><strong>Solution:</strong></dt>
  +
  +        <dd>
  +          Use an external rewrite map, i.e. a program which acts
  +          like a rewrite map. It is run once on startup of Apache,
  +          receives the requested URLs on STDIN, and has to put the
  +          resulting (usually rewritten) URL on STDOUT (same
  +          order!). 
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   RewriteEngine on
  -RewriteMap    quux-map       <STRONG>prg:</STRONG>/path/to/map.quux.pl
  -RewriteRule   ^/~quux/(.*)$  /~quux/<STRONG>${quux-map:$1}</STRONG>
  -</PRE></TD></TR></TABLE>
  -
  -<P><TABLE BGCOLOR="#E0E5F5" BORDER="0" CELLSPACING="0" CELLPADDING="5"><TR><TD><PRE>
  +RewriteMap    quux-map       <strong>prg:</strong>/path/to/map.quux.pl
  +RewriteRule   ^/~quux/(.*)$  /~quux/<strong>${quux-map:$1}</strong>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <table bgcolor="#E0E5F5" border="0" cellspacing="0"
  +          cellpadding="5">
  +            <tr>
  +              <td>
  +<pre>
   #!/path/to/perl
   
   #   disable buffered I/O which would lead 
  @@ -1890,17 +2340,21 @@
       s|^foo/|bar/|;
       print $_;
   }
  -</PRE></TD></TR></TABLE>
  +</pre>
  +              </td>
  +            </tr>
  +          </table>
  +
  +          <p>This is a demonstration-only example and just rewrites
  +          all URLs <code>/~quux/foo/...</code> to
  +          <code>/~quux/bar/...</code>. Actually you can program
  +          whatever you like. But notice that while such maps can be
  +          <strong>used</strong> also by an average user, only the
  +          system administrator can <strong>define</strong> them.</p>
  +        </dd>
  +      </dl>
  +      <!--#include virtual="footer.html" -->
  +    </blockquote>
  +  </body>
  +</html>
   
  -<P>
  -This is a demonstration-only example and just rewrites all URLs
  -<CODE>/~quux/foo/...</CODE> to <CODE>/~quux/bar/...</CODE>. Actually you can program
  -whatever you like. But notice that while such maps can be <STRONG>used</STRONG> also by
  -an average user, only the system administrator can <STRONG>define</STRONG> it.
  -
  -</DL>
  -
  -<!--#include virtual="footer.html" -->
  -</BLOCKQUOTE>
  -</BODY>
  -</HTML>
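
  Note on the "External Rewriting Engine" entry in the rewriteguide.html
  hunks above: the prg: map program shown there is written in Perl. A
  minimal sketch of the same idea in Python follows, assuming the same
  demonstration rule (a leading foo/ becomes bar/). The names rewrite_key
  and serve are hypothetical and exist only for this illustration; they do
  not come from the Apache documentation.

```python
import sys
from typing import TextIO


def rewrite_key(key: str) -> str:
    """Apply the demo rule: a key starting with foo/ is moved under bar/."""
    if key.startswith("foo/"):
        return "bar/" + key[len("foo/"):]
    return key


def serve(inp: TextIO, out: TextIO) -> None:
    """Answer one lookup key per input line, in the same order."""
    for line in inp:
        out.write(rewrite_key(line.rstrip("\n")) + "\n")
        # Flush after every answer so the caller is never left waiting
        # on buffered output.
        out.flush()


# A script used as a map program would end with:
#     serve(sys.stdin, sys.stdout)
```

  Taking the streams as parameters keeps the rewriting rule testable
  without a running server; only the final call ties it to STDIN/STDOUT.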
  
  
  
  1.23      +210 -206  httpd-2.0/docs/manual/misc/security_tips.html
  
  Index: security_tips.html
  ===================================================================
  RCS file: /home/cvs/httpd-2.0/docs/manual/misc/security_tips.html,v
  retrieving revision 1.22
  retrieving revision 1.23
  diff -u -r1.22 -r1.23
  --- security_tips.html	2000/09/12 15:16:47	1.22
  +++ security_tips.html	2001/09/22 19:33:40	1.23
  @@ -1,183 +1,194 @@
  -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
  -<HTML>
  -<HEAD>
  -<TITLE>Apache HTTP Server: Security Tips</TITLE>
  -</HEAD>
  -
  -<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  -<BODY
  - BGCOLOR="#FFFFFF"
  - TEXT="#000000"
  - LINK="#0000FF"
  - VLINK="#000080"
  - ALINK="#FF0000"
  ->
  -<!--#include virtual="header.html" -->
  -<H1 ALIGN="CENTER">Security Tips for Server Configuration</H1>
  -
  -<HR>
  -
  -<P>Some hints and tips on security issues in setting up a web server. Some of
  -the suggestions will be general, others specific to Apache.
  -
  -<HR>
  -
  -<H2><A NAME="serverroot">Permissions on ServerRoot Directories</A></H2>
  -<P>In typical operation, Apache is started by the root
  -user, and it switches to the user defined by the <A
  -HREF="../mod/core.html#user"><STRONG>User</STRONG></A> directive to serve hits.
  -As is the case with any command that root executes, you must take care
  -that it is protected from modification by non-root users.  Not only
  -must the files themselves be writeable only by root, but so must the
  -directories, and parents of all directories.  For example, if you
  -choose to place ServerRoot in <CODE>/usr/local/apache</CODE> then it is
  -suggested that you create that directory as root, with commands
  -like these:
  +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  +    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
   
  -<BLOCKQUOTE><PRE>
  +<html xmlns="http://www.w3.org/1999/xhtml">
  +  <head>
  +    <meta name="generator" content="HTML Tidy, see www.w3.org" />
  +
  +    <title>Apache HTTP Server: Security Tips</title>
  +  </head>
  +  <!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  +
  +  <body bgcolor="#FFFFFF" text="#000000" link="#0000FF"
  +  vlink="#000080" alink="#FF0000">
  +    <!--#include virtual="header.html" -->
  +
  +    <h1 align="CENTER">Security Tips for Server Configuration</h1>
  +    <hr />
  +
  +    <p>Some hints and tips on security issues in setting up a web
  +    server. Some of the suggestions will be general, others
  +    specific to Apache.</p>
  +    <hr />
  +
  +    <h2><a id="serverroot" name="serverroot">Permissions on
  +    ServerRoot Directories</a></h2>
  +
  +    <p>In typical operation, Apache is started by the root user,
  +    and it switches to the user defined by the <a
  +    href="../mod/core.html#user"><strong>User</strong></a>
  +    directive to serve hits. As is the case with any command that
  +    root executes, you must take care that it is protected from
  +    modification by non-root users. Not only must the files
  +    themselves be writeable only by root, but so must the
  +    directories, and parents of all directories. For example, if
  +    you choose to place ServerRoot in
  +    <code>/usr/local/apache</code> then it is suggested that you
  +    create that directory as root, with commands like these:</p>
  +
  +    <blockquote>
  +<pre>
       mkdir /usr/local/apache
       cd /usr/local/apache
       mkdir bin conf logs
       chown 0 . bin conf logs
       chgrp 0 . bin conf logs
       chmod 755 . bin conf logs
  -</PRE></BLOCKQUOTE>
  -
  -It is assumed that /, /usr, and /usr/local are only modifiable by root.
  -When you install the httpd executable, you should ensure that it is
  -similarly protected:
  +</pre>
  +    </blockquote>
  +    It is assumed that /, /usr, and /usr/local are only modifiable
  +    by root. When you install the httpd executable, you should
  +    ensure that it is similarly protected: 
   
  -<BLOCKQUOTE><PRE>
  +    <blockquote>
  +<pre>
       cp httpd /usr/local/apache/bin
       chown 0 /usr/local/apache/bin/httpd
       chgrp 0 /usr/local/apache/bin/httpd
       chmod 511 /usr/local/apache/bin/httpd
  -</PRE></BLOCKQUOTE>
  +</pre>
  +    </blockquote>
   
  -<P>You can create an htdocs subdirectory which is modifiable by other
  -users -- since root never executes any files out of there, and shouldn't
  -be creating files in there.
  -
  -<P>If you allow non-root users to modify any files that root either
  -executes or writes on then you open your system to root compromises.
  -For example, someone could replace the httpd binary so that the next
  -time you start it, it will execute some arbitrary code.  If the logs
  -directory is writeable (by a non-root user), someone
  -could replace a log file with a symlink to some other system file,
  -and then root might overwrite that file with arbitrary data.  If the
  -log files themselves are writeable (by a non-root user), then someone
  -may be able to overwrite the log itself with bogus data.
  -<P>
  -<HR>
  -<H2>Server Side Includes</H2>
  -<P>Server side includes (SSI) can be configured so that users can execute
  -arbitrary programs on the server. That thought alone should send a shiver
  -down the spine of any sys-admin.<P>
  -
  -One solution is to disable that part of SSI. To do that you use the
  -IncludesNOEXEC option to the <A HREF="../mod/core.html#options">Options</A>
  -directive.<P>
  -
  -<HR>
  -
  -<H2>Non Script Aliased CGI</H2>
  -<P>Allowing users to execute <STRONG>CGI</STRONG> scripts in any directory
  -should only
  -be considered if;
  -<OL>
  - <LI>You trust your users not to write scripts which will deliberately or
  -accidentally expose your system to an attack.
  - <LI>You consider security at your site to be so feeble in other areas, as to
  -make one more potential hole irrelevant.
  - <LI>You have no users, and nobody ever visits your server.
  -</OL><P>
  -<HR>
  -
  -<H2>Script Alias'ed CGI</H2>
  -<P>Limiting <STRONG>CGI</STRONG> to special directories gives the admin
  -control over
  -what goes into those directories. This is inevitably more secure than
  -non script aliased CGI, but <STRONG>only if users with write access to the
  -directories are trusted</STRONG> or the admin is willing to test each new CGI
  -script/program for potential security holes.<P>
  -
  -Most sites choose this option over the non script aliased CGI approach.<P>
  -
  -<HR>
  -<H2>CGI in general</H2>
  -<P>Always remember that you must trust the writers of the CGI script/programs
  -or your ability to spot potential security holes in CGI, whether they were
  -deliberate or accidental.<P>
  -
  -All the CGI scripts will run as the same user, so they have potential to
  -conflict (accidentally or deliberately) with other scripts <EM>e.g.</EM>
  -User A hates User B, so he writes a script to trash User B's CGI
  -database.  One program which can be used to allow scripts to run
  -as different users is <A HREF="../suexec.html">suEXEC</A> which is
  -included with Apache as of 1.2 and is called from special hooks in
  -the Apache server code.  Another popular way of doing this is with
  -<A HREF="http://wwwcgi.umr.edu/~cgiwrap/">CGIWrap</A>.  <P>
  -
  -<HR>
  -
  -
  -<H2>Stopping users overriding system wide settings...</H2>
  -<P>To run a really tight ship, you'll want to stop users from setting
  -up <CODE>.htaccess</CODE> files which can override security features
  -you've configured. Here's one way to do it...<P>
  -
  -In the server configuration file, put
  -<BLOCKQUOTE><CODE>
  -&lt;Directory /&gt; <BR>
  -AllowOverride None <BR>
  -Options None <BR>
  -Allow from all <BR>
  -&lt;/Directory&gt; <BR>
  -</CODE></BLOCKQUOTE>
  -
  -Then setup for specific directories<P>
  -
  -This stops all overrides, Includes and accesses in all directories apart
  -from those named.<P>
  -<HR>
  -<H2>
  - Protect server files by default
  -</H2>
  -<P>
  -One aspect of Apache which is occasionally misunderstood is the feature
  -of default access.  That is, unless you take steps to change it, if the
  -server can find its way to a file through normal URL mapping rules, it
  -can serve it to clients.
  -</P>
  -<P>
  -For instance, consider the following example:
  -</P>
  -<OL>
  - <LI><SAMP># cd /; ln -s / public_html</SAMP>
  - </LI>
  - <LI>Accessing <SAMP>http://localhost/~root/</SAMP>
  - </LI>
  -</OL>
  -<P>
  -This would allow clients to walk through the entire filesystem.  To work
  -around this, add the following block to your server's configuration:
  -</P>
  -<PRE>
  +    <p>You can create an htdocs subdirectory which is modifiable by
  +    other users -- since root never executes any files out of
  +    there, and shouldn't be creating files in there.</p>
  +
  +    <p>If you allow non-root users to modify any files that root
  +    either executes or writes to, then you open your system to root
  +    compromises. For example, someone could replace the httpd
  +    binary so that the next time you start it, it will execute some
  +    arbitrary code. If the logs directory is writeable (by a
  +    non-root user), someone could replace a log file with a symlink
  +    to some other system file, and then root might overwrite that
  +    file with arbitrary data. If the log files themselves are
  +    writeable (by a non-root user), then someone may be able to
  +    overwrite the log itself with bogus data.</p>
  +    <hr />
  +
  +    <h2>Server Side Includes</h2>
  +
  +    <p>Server side includes (SSI) can be configured so that users
  +    can execute arbitrary programs on the server. That thought
  +    alone should send a shiver down the spine of any sys-admin.</p>
  +
  +    <p>One solution is to disable that part of SSI. To do that you
  +    use the IncludesNOEXEC option to the <a
  +    href="../mod/core.html#options">Options</a> directive.</p>
  +    <hr />
  +
  +    <h2>Non Script Aliased CGI</h2>
  +
  +    <p>Allowing users to execute <strong>CGI</strong> scripts in
  +    any directory should only be considered if:</p>
  +
  +    <ol>
  +      <li>You trust your users not to write scripts which will
  +      deliberately or accidentally expose your system to an
  +      attack.</li>
  +
  +      <li>You consider security at your site to be so feeble in
  +      other areas, as to make one more potential hole
  +      irrelevant.</li>
  +
  +      <li>You have no users, and nobody ever visits your
  +      server.</li>
  +    </ol>
  +    <hr />
  +
  +    <h2>Script Alias'ed CGI</h2>
  +
  +    <p>Limiting <strong>CGI</strong> to special directories gives
  +    the admin control over what goes into those directories. This
  +    is inevitably more secure than non script aliased CGI, but
  +    <strong>only if users with write access to the directories are
  +    trusted</strong> or the admin is willing to test each new CGI
  +    script/program for potential security holes.</p>
  +
  +    <p>Most sites choose this option over the non script aliased
  +    CGI approach.</p>
  +    <hr />
  +
  +    <h2>CGI in general</h2>
  +
  +    <p>Always remember that you must trust the writers of the CGI
  +    script/programs or your ability to spot potential security
  +    holes in CGI, whether they were deliberate or accidental.</p>
  +
  +    <p>All the CGI scripts will run as the same user, so they have
  +    potential to conflict (accidentally or deliberately) with other
  +    scripts <em>e.g.</em> User A hates User B, so he writes a
  +    script to trash User B's CGI database. One program which can be
  +    used to allow scripts to run as different users is <a
  +    href="../suexec.html">suEXEC</a> which is included with Apache
  +    as of 1.2 and is called from special hooks in the Apache server
  +    code. Another popular way of doing this is with <a
  +    href="http://wwwcgi.umr.edu/~cgiwrap/">CGIWrap</a>.</p>
  +    <hr />
  +
  +    <h2>Stopping users overriding system wide settings...</h2>
  +
  +    <p>To run a really tight ship, you'll want to stop users from
  +    setting up <code>.htaccess</code> files which can override
  +    security features you've configured. Here's one way to do
  +    it...</p>
  +
  +    <p>In the server configuration file, put</p>
  +
  +    <blockquote>
  +      <code>&lt;Directory /&gt;<br />
  +       AllowOverride None<br />
  +       Options None<br />
  +       Allow from all<br />
  +       &lt;/Directory&gt;<br />
  +      </code>
  +    </blockquote>
  +    Then setup for specific directories 
  +
  +    <p>This stops all overrides, Includes and accesses in all
  +    directories apart from those named.</p>
  +    <hr />
  +
  +    <h2>Protect server files by default</h2>
  +
  +    <p>One aspect of Apache which is occasionally misunderstood is
  +    the feature of default access. That is, unless you take steps
  +    to change it, if the server can find its way to a file through
  +    normal URL mapping rules, it can serve it to clients.</p>
  +
  +    <p>For instance, consider the following example:</p>
  +
  +    <ol>
  +      <li><samp># cd /; ln -s / public_html</samp></li>
  +
  +      <li>Accessing <samp>http://localhost/~root/</samp></li>
  +    </ol>
  +
  +    <p>This would allow clients to walk through the entire
  +    filesystem. To work around this, add the following block to
  +    your server's configuration:</p>
  +<pre>
    &lt;Directory /&gt;
        Order Deny,Allow
        Deny from all
    &lt;/Directory&gt;
  -</PRE>
  -<P>
  -This will forbid default access to filesystem locations.  Add
  -appropriate
  -<A
  - HREF="../mod/core.html#directory"
  -><SAMP>&lt;Directory&gt;</SAMP></A>
  -blocks to allow access only
  -in those areas you wish.  For example,
  -</P>
  -<PRE>
  +</pre>
  +
  +    <p>This will forbid default access to filesystem locations. Add
  +    appropriate <a
  +    href="../mod/core.html#directory"><samp>&lt;Directory&gt;</samp></a>
  +    blocks to allow access only in those areas you wish. For
  +    example,</p>
  +<pre>
    &lt;Directory /usr/users/*/public_html&gt;
        Order Deny,Allow
        Allow from all
  @@ -186,46 +197,39 @@
        Order Deny,Allow
        Allow from all
    &lt;/Directory&gt;
  -</PRE>
  -<P>
  -Pay particular attention to the interactions of
  -<A
  - HREF="../mod/core.html#location"
  -><SAMP>&lt;Location&gt;</SAMP></A>
  -and
  -<A
  - HREF="../mod/core.html#directory"
  -><SAMP>&lt;Directory&gt;</SAMP></A>
  -directives; for instance, even if <SAMP>&lt;Directory /&gt;</SAMP>
  -denies access, a <SAMP>&lt;Location /&gt;</SAMP> directive might
  -overturn it.
  -</P>
  -<P>
  -Also be wary of playing games with the
  -<A
  - HREF="../mod/mod_userdir.html#userdir"
  ->UserDir</A>
  -directive; setting it to something like <SAMP>&quot;./&quot;</SAMP>
  -would have the same effect, for root, as the first example above.
  -If you are using Apache 1.3 or above, we strongly recommend that you
  -include the following line in your server configuration files:
  -</P>
  -<DL>
  - <DD><SAMP>UserDir&nbsp;disabled&nbsp;root</SAMP>
  - </DD>
  -</DL>
  -
  -<HR>
  -<P>Please send any other useful security tips to The Apache Group
  -by filling out a
  -<A HREF="http://www.apache.org/bug_report.html">problem report</A>.  
  -If you are confident you have found a security bug in the Apache
  -source code itself, <A
  -HREF="http://www.apache.org/security_report.html">please let us
  -know</A>.
  -
  -<P>
  -
  -<!--#include virtual="footer.html" -->
  -</BODY>
  -</HTML>
  +</pre>
  +
  +    <p>Pay particular attention to the interactions of <a
  +    href="../mod/core.html#location"><samp>&lt;Location&gt;</samp></a>
  +    and <a
  +    href="../mod/core.html#directory"><samp>&lt;Directory&gt;</samp></a>
  +    directives; for instance, even if <samp>&lt;Directory
  +    /&gt;</samp> denies access, a <samp>&lt;Location /&gt;</samp>
  +    directive might overturn it.</p>
  +
  +    <p>Also be wary of playing games with the <a
  +    href="../mod/mod_userdir.html#userdir">UserDir</a> directive;
  +    setting it to something like <samp>"./"</samp> would have the
  +    same effect, for root, as the first example above. If you are
  +    using Apache 1.3 or above, we strongly recommend that you
  +    include the following line in your server configuration
  +    files:</p>
  +
  +    <dl>
  +      <dd><samp>UserDir&nbsp;disabled&nbsp;root</samp></dd>
  +    </dl>
  +    <hr />
  +
  +    <p>Please send any other useful security tips to The Apache
  +    Group by filling out a <a
  +    href="http://www.apache.org/bug_report.html">problem
  +    report</a>. If you are confident you have found a security bug
  +    in the Apache source code itself, <a
  +    href="http://www.apache.org/security_report.html">please let us
  +    know</a>.</p>
  +
  +    <p><!--#include virtual="footer.html" -->
  +    </p>
  +  </body>
  +</html>
  +
  
  
  
  1.7       +211 -207  httpd-2.0/docs/manual/misc/tutorials.html
  
  Index: tutorials.html
  ===================================================================
  RCS file: /home/cvs/httpd-2.0/docs/manual/misc/tutorials.html,v
  retrieving revision 1.6
  retrieving revision 1.7
  diff -u -r1.6 -r1.7
  --- tutorials.html	2001/01/28 01:02:54	1.6
  +++ tutorials.html	2001/09/22 19:33:40	1.7
  @@ -1,209 +1,213 @@
  -<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
  -<HTML>
  -<HEAD>
  -<TITLE>Apache Tutorials</TITLE>
  -</HEAD>
  +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  +    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
   
  -<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  -<BODY
  - BGCOLOR="#FFFFFF"
  - TEXT="#000000"
  - LINK="#0000FF"
  - VLINK="#000080"
  - ALINK="#FF0000"
  ->
  -<!--#include virtual="header.html" -->
  +<html xmlns="http://www.w3.org/1999/xhtml">
  +  <head>
  +    <meta name="generator" content="HTML Tidy, see www.w3.org" />
  +
  +    <title>Apache Tutorials</title>
  +  </head>
  +  <!-- Background white, links blue (unvisited), navy (visited), red (active) -->
  +
  +  <body bgcolor="#FFFFFF" text="#000000" link="#0000FF"
  +  vlink="#000080" alink="#FF0000">
  +    <!--#include virtual="header.html" -->
  +
  +    <blockquote>
  +      <strong>Warning:</strong> This document has not been updated
  +      to take into account changes made in the 2.0 version of the
  +      Apache HTTP Server. Some of the information may still be
  +      relevant, but please use it with care.
  +    </blockquote>
  +
  +    <h1 align="CENTER">Apache Tutorials</h1>
  +
  +    <p>The following documents give you step-by-step instructions
  +    on how to accomplish common tasks with the Apache http server.
  +    Many of these documents are located at external sites and are
  +    not the work of the Apache Software Foundation. Copyright to
  +    documents on external sites is owned by the authors or their
  +    assignees. Please consult the <a href="../">official Apache
  +    Server documentation</a> to verify what you read on external
  +    sites.</p>
  +
  +    <h2>Installation &amp; Getting Started</h2>
  +
  +    <ul>
  +      <li><a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-06-1-001-01-NW-DP-LF">
  +      Getting Started with Apache 1.3</a> (ApacheToday)</li>
  +
  +      <li><a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-07-10-001-01-NW-LF-SW">
  +      Configuring Your Apache Server Installation</a>
  +      (ApacheToday)</li>
  +
  +      <li><a
  +      href="http://oreilly.apacheweek.com/pub/a/apache/2000/02/24/installing_apache.html">
  +      Getting, Installing, and Running Apache (on Unix)</a>
  +      (O'Reilly Network Apache DevCenter)</li>
  +
  +      <li><a
  +      href="http://www.builder.com/Servers/Apache/ss01.html">Maximum
  +      Apache: Getting Started</a> (CNET Builder.com)</li>
  +
  +      <li><a
  +      href="http://www.devshed.com/Server_Side/Administration/APACHE/">
  +      How to Build the Apache of Your Dreams</a> (Developer
  +      Shed)</li>
  +    </ul>
  +
  +    <h2>Basic Configuration</h2>
  +
  +    <ul>
  +      <li><a
  +      href="http://oreilly.apacheweek.com/pub/a/apache/2000/03/02/configuring_apache.html">
  +      An Amble Through Apache Configuration</a> (O'Reilly Network
  +      Apache DevCenter)</li>
  +
  +      <li><a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-07-19-002-01-NW-LF-SW">
  +      Using .htaccess Files with Apache</a> (ApacheToday)</li>
  +
  +      <li><a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-07-17-001-01-PS">
  +      Setting Up Virtual Hosts</a> (ApacheToday)</li>
  +
  +      <li><a
  +      href="http://www.builder.com/Servers/Apache/ss02.html">Maximum
  +      Apache: Configure Apache</a> (CNET Builder.com)</li>
  +
  +      <li>Getting More Out of Apache <a
  +      href="http://www.devshed.com/Server_Side/Administration/MoreApache/">
  +      Part 1</a> - <a
  +      href="http://www.devshed.com/Server_Side/Administration/MoreApache2/">
  +      Part 2</a> (Developer Shed)</li>
  +    </ul>
  +
  +    <h2>Security</h2>
  +
  +    <ul>
  +      <li><a
  +      href="http://www.linuxplanet.com/linuxplanet/tutorials/1527/1/">
  +      Security and Apache: An Essential Primer</a>
  +      (LinuxPlanet)</li>
  +
  +      <li><a
  +      href="http://www.apacheweek.com/features/userauth">Using User
  +      Authentication</a> (Apacheweek)</li>
  +
  +      <li><a href="http://www.apacheweek.com/features/dbmauth">DBM
  +      User Authentication</a> (Apacheweek)</li>
  +
  +      <li><a
  +      href="http://linux.com/security/newsitem.phtml?sid=12&amp;aid=3549">
  +      An Introduction to Securing Apache</a> (Linux.com)</li>
  +
  +      <li><a
  +      href="http://linux.com/security/newsitem.phtml?sid=12&amp;aid=3667">
  +      Securing Apache - Access Control</a> (Linux.com)</li>
  +
  +      <li>Apache Authentication <a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-07-24-002-01-NW-LF-SW">
  +      Part 1</a> - <a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-07-31-001-01-NW-DP-LF">
  +      Part 2</a> - <a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-08-07-001-01-NW-LF-SW">
  +      Part 3</a> - <a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-08-14-001-01-NW-LF-SW">
  +      Part 4</a> (ApacheToday)</li>
  +
  +      <li><a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-11-13-003-01-SC-LF-SW">
  +      mod_access: Restricting Access by Host</a> (ApacheToday)</li>
  +    </ul>
  +
  +    <h2>Logging</h2>
  +
  +    <ul>
  +      <li><a
  +      href="http://oreilly.apacheweek.com/pub/a/apache/2000/03/10/log_rhythms.html">
  +      Log Rhythms</a> (O'Reilly Network Apache DevCenter)</li>
  +
  +      <li><a
  +      href="http://www.apacheweek.com/features/logfiles">Gathering
  +      Visitor Information: Customising Your Logfiles</a>
  +      (Apacheweek)</li>
  +
  +      <li>Apache Guide: Logging <a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-08-21-003-01-NW-LF-SW">
  +      Part 1</a> - <a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-08-28-001-01-NW-LF-SW">
  +      Part 2</a> - <a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-09-05-001-01-NW-LF-SW">
  +      Part 3</a> - <a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-09-18-003-01-NW-LF-SW">
  +      Part 4</a> - <a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-09-25-001-01-NW-LF-SW">
  +      Part 5</a> (ApacheToday)</li>
  +    </ul>
  +
  +    <h2>CGI and SSI</h2>
  +
  +    <ul>
  +      <li><a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-06-05-001-10-NW-LF-SW">
  +      Dynamic Content with CGI</a> (ApacheToday)</li>
  +
  +      <li><a
  +      href="http://www.perl.com/CPAN-local/doc/FAQs/cgi/idiots-guide.html">
  +      The Idiot's Guide to Solving Perl CGI Problems</a>
  +      (CPAN)</li>
  +
  +      <li><a
  +      href="http://www.linuxplanet.com/linuxplanet/tutorials/1445/1/">
  +      Executing CGI Scripts as Other Users</a> (LinuxPlanet)</li>
  +
  +      <li><a href="http://www.htmlhelp.org/faq/cgifaq.html">CGI
  +      Programming FAQ</a> (Web Design Group)</li>
  +
  +      <li>Introduction to Server Side Includes <a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-06-12-001-01-PS">
  +      Part 1</a> - <a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-06-19-002-01-NW-LF-SW">
  +      Part 2</a> (ApacheToday)</li>
  +
  +      <li><a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-06-26-001-01-NW-LF-SW">
  +      Advanced SSI Techniques</a> (ApacheToday)</li>
  +
  +      <li><a
  +      href="http://www.builder.com/Servers/ApacheFiles/082400/">Setting
  +      up CGI and SSI with Apache</a> (CNET Builder.com)</li>
  +    </ul>
  +
  +    <h2>Other Features</h2>
  +
  +    <ul>
  +      <li><a
  +      href="http://www.apacheweek.com/features/negotiation">Content
  +      Negotiation Explained</a> (Apacheweek)</li>
  +
  +      <li><a
  +      href="http://www.apacheweek.com/features/imagemaps">Using
  +      Apache Imagemaps</a> (Apacheweek)</li>
  +
  +      <li><a
  +      href="http://apachetoday.com/news_story.php3?ltsn=2000-06-14-002-01-PS">
  +      Keeping Your Images from Adorning Other Sites</a>
  +      (ApacheToday)</li>
  +
  +      <li><a
  +      href="http://ppewww.ph.gla.ac.uk/~flavell/www/lang-neg.html">Language
  +      Negotiation Notes</a> (Alan J. Flavell)</li>
  +    </ul>
  +
  +    <p>If you have a pointer to a an accurate and well-written
  +    tutorial not included here, please let us know by submitting it
  +    to the <a href="http://bugs.apache.org/">Apache Bug
  +    Database</a>. <!--#include virtual="footer.html" -->
  +    </p>
  +  </body>
  +</html>
   
  -<blockquote><strong>Warning:</strong>
  -This document has not been updated to take into account changes
  -made in the 2.0 version of the Apache HTTP Server.  Some of the
  -information may still be relevant, but please use it
  -with care.
  -</blockquote>
  -
  -
  -<H1 ALIGN="CENTER">Apache Tutorials</H1>
  -
  -<P>The following documents give you step-by-step instructions on how
  -to accomplish common tasks with the Apache http server.  Many of these
  -documents are located at external sites and are not the work of the
  -Apache Software Foundation.  Copyright to documents on external sites
  -is owned by the authors or their assignees.  Please consult the <A
  -HREF="../">official Apache Server documentation</A> to verify what you
  -read on external sites.
  -
  -
  -<H2>Installation & Getting Started</H2>
  -
  -<UL>
  -
  -<LI><A
  -HREF="http://apachetoday.com/news_story.php3?ltsn=2000-06-1-001-01-NW-DP-LF"
  ->Getting Started with Apache 1.3</A> (ApacheToday)
  -
  -<LI><A
  -HREF="http://apachetoday.com/news_story.php3?ltsn=2000-07-10-001-01-NW-LF-SW"
  ->Configuring Your Apache Server Installation</A> (ApacheToday)
  -
  -<LI><A
  -HREF="http://oreilly.apacheweek.com/pub/a/apache/2000/02/24/installing_apache.html"
  ->Getting, Installing, and Running Apache (on Unix)</A> (O'Reilly
  -Network Apache DevCenter)
  -
  -<LI><A HREF="http://www.builder.com/Servers/Apache/ss01.html">Maximum
  -Apache: Getting Started</A> (CNET Builder.com)
  -
  -<LI><A HREF="http://www.devshed.com/Server_Side/Administration/APACHE/"
  ->How to Build the Apache of Your Dreams</A> (Developer Shed)
  -
  -</UL>
  -
  -
  -<H2>Basic Configuration</H2>
  -
  -<UL>
  -
  -<LI><A
  -HREF="http://oreilly.apacheweek.com/pub/a/apache/2000/03/02/configuring_apache.html"
  ->An Amble Through Apache Configuration</A> (O'Reilly Network Apache
  -DevCenter)
  -
  -<LI><A
  -HREF="http://apachetoday.com/news_story.php3?ltsn=2000-07-19-002-01-NW-LF-SW"
  ->Using .htaccess Files with Apache</A> (ApacheToday)
  -
  -<LI><A
  -HREF="http://apachetoday.com/news_story.php3?ltsn=2000-07-17-001-01-PS"
  ->Setting Up Virtual Hosts</A> (ApacheToday)
  -
  -<LI><A HREF="http://www.builder.com/Servers/Apache/ss02.html">Maximum
  -Apache: Configure Apache</A> (CNET Builder.com)
  -
  -<LI>Getting More Out of Apache <A HREF="http://www.devshed.com/Server_Side/Administration/MoreApache/">Part 1</A> - <A HREF="http://www.devshed.com/Server_Side/Administration/MoreApache2/">Part 2</A> (Developer Shed)
  -
  -</UL>
  -
  -<H2>Security</H2>
  -
  -<UL>
  -
  -<LI><A
  -HREF="http://www.linuxplanet.com/linuxplanet/tutorials/1527/1/">Security
  -and Apache: An Essential Primer</A> (LinuxPlanet)
  -
  -<LI><A HREF="http://www.apacheweek.com/features/userauth">Using User
  -Authentication</A> (Apacheweek)
  -
  -<LI><A HREF="http://www.apacheweek.com/features/dbmauth">DBM User
  -Authentication</A> (Apacheweek)
  -
  -<LI><A
  -HREF="http://linux.com/security/newsitem.phtml?sid=12&aid=3549">An
  -Introduction to Securing Apache</A> (Linux.com)
  -
  -<LI><A
  -HREF="http://linux.com/security/newsitem.phtml?sid=12&aid=3667">Securing
  -Apache - Access Control</A> (Linux.com)
  -
  -<LI>Apache Authentication <A
  -HREF="http://apachetoday.com/news_story.php3?ltsn=2000-07-24-002-01-NW-LF-SW"
  ->Part 1</A> - <A
  -HREF="http://apachetoday.com/news_story.php3?ltsn=2000-07-31-001-01-NW-DP-LF"
  ->Part 2</A> - <A
  -HREF="http://apachetoday.com/news_story.php3?ltsn=2000-08-07-001-01-NW-LF-SW"
  ->Part 3</A> - <A
  -HREF="http://apachetoday.com/news_story.php3?ltsn=2000-08-14-001-01-NW-LF-SW"
  ->Part 4</A> (ApacheToday)
  -
  -<LI><a href="http://apachetoday.com/news_story.php3?ltsn=2000-11-13-003-01-SC-LF-SW"
  ->mod_access: Restricting Access by Host</a> (ApacheToday)
  -
  -</UL>
  -
  -<H2>Logging</H2>
  -
  -<UL>
  -
  -<LI><A
  -HREF="http://oreilly.apacheweek.com/pub/a/apache/2000/03/10/log_rhythms.html"
  ->Log Rhythms</A> (O'Reilly Network Apache DevCenter)
  -
  -<LI><A HREF="http://www.apacheweek.com/features/logfiles">Gathering
  -Visitor Information: Customising Your Logfiles</A> (Apacheweek)
  -
  -<LI>Apache Guide: Logging
  -<A HREF="http://apachetoday.com/news_story.php3?ltsn=2000-08-21-003-01-NW-LF-SW"
  ->Part 1</A> -
  -<A HREF="http://apachetoday.com/news_story.php3?ltsn=2000-08-28-001-01-NW-LF-SW"
  ->Part 2</A> - 
  -<A HREF="http://apachetoday.com/news_story.php3?ltsn=2000-09-05-001-01-NW-LF-SW"
  ->Part 3</A> -
  -<A HREF="http://apachetoday.com/news_story.php3?ltsn=2000-09-18-003-01-NW-LF-SW"
  ->Part 4</A> -
  -<A HREF="http://apachetoday.com/news_story.php3?ltsn=2000-09-25-001-01-NW-LF-SW"
  ->Part 5</A> (ApacheToday)
  -
  -</UL>
  -
  -<H2>CGI and SSI</H2>
  -
  -<UL>
  -
  -<LI><A
  -HREF="http://apachetoday.com/news_story.php3?ltsn=2000-06-05-001-10-NW-LF-SW"
  ->Dynamic Content with CGI</A> (ApacheToday)
  -
  -<LI><A
  -HREF="http://www.perl.com/CPAN-local/doc/FAQs/cgi/idiots-guide.html">The
  -Idiot's Guide to Solving Perl CGI Problems</A> (CPAN)
  -
  -<LI><A
  -HREF="http://www.linuxplanet.com/linuxplanet/tutorials/1445/1/">Executing
  -CGI Scripts as Other Users</A> (LinuxPlanet)
  -
  -<LI><A HREF="http://www.htmlhelp.org/faq/cgifaq.html">CGI Programming
  -FAQ</A> (Web Design Group)
  -
  -<LI>Introduction to Server Side Includes <A
  -HREF="http://apachetoday.com/news_story.php3?ltsn=2000-06-12-001-01-PS">Part
  -1</A> - <A
  -HREF="http://apachetoday.com/news_story.php3?ltsn=2000-06-19-002-01-NW-LF-SW"
  ->Part 2</A> (ApacheToday)
  -
  -<LI><A
  -HREF="http://apachetoday.com/news_story.php3?ltsn=2000-06-26-001-01-NW-LF-SW"
  ->Advanced SSI Techniques</A> (ApacheToday)
  -
  -<LI><A
  -HREF="http://www.builder.com/Servers/ApacheFiles/082400/">Setting up
  -CGI and SSI with Apache</A> (CNET Builder.com)
  -
  -</UL>
  -
  -<H2>Other Features</H2>
  -
  -<UL>
  -
  -<LI><A HREF="http://www.apacheweek.com/features/negotiation">Content
  -Negotiation Explained</A> (Apacheweek)
  -
  -<LI><A HREF="http://www.apacheweek.com/features/imagemaps">Using
  -Apache Imagemaps</A> (Apacheweek)
  -
  -<LI><A
  -HREF="http://apachetoday.com/news_story.php3?ltsn=2000-06-14-002-01-PS"
  ->Keeping Your Images from Adorning Other Sites</A> (ApacheToday)
  -
  -<LI><A HREF="http://ppewww.ph.gla.ac.uk/~flavell/www/lang-neg.html"
  ->Language Negotiation Notes</A> (Alan J. Flavell)
  -
  -</UL>
  -
  -
  -<P>If you have a pointer to a an accurate and well-written tutorial
  -not included here, please let us know by submitting it to the
  -<A HREF="http://bugs.apache.org/">Apache Bug Database</A>.
  -
  -<!--#include virtual="footer.html" -->
  -</BODY>
  -</HTML>
  
  
  
