httpd-cvs mailing list archives

Subject cvs commit: httpd-2.0/docs/manual/mod mod_proxy.xml
Date Sat, 09 Mar 2002 22:23:30 GMT
slive       02/03/09 14:23:30

  Added:       docs/manual/mod mod_proxy.xml
  XML for mod_proxy.
  Note: This one was, and still is, in pretty rough shape, both in terms of
  1. Stinky markup (why use a <p> when a couple <br>s will do?)
  2. Out of date and insufficient content.
  Revision  Changes    Path
  1.1                  httpd-2.0/docs/manual/mod/mod_proxy.xml
  Index: mod_proxy.xml
  <?xml version="1.0"?>
  <!DOCTYPE modulesynopsis SYSTEM "../style/modulesynopsis.dtd">
  <?xml-stylesheet type="text/xsl" href="../style/manual.xsl"?>
  <description>HTTP/1.1 proxy/gateway server</description>
  <note type="warning"><title>Warning</title>
  This document has been updated to take into account changes
  made in the 2.0 version of the Apache HTTP Server.  Some of the
  information may still be inaccurate; please use it
  with care.
  <p>This module implements a proxy/gateway for Apache. It implements
  proxying capability for
  <code>CONNECT</code> (for SSL),
  <code>HTTP/1.0</code>, and <code>HTTP/1.1</code>.
  The module can be configured to connect to other proxy modules for these
  and other protocols.</p>
  <p>This module was experimental in Apache 1.1.x. Improvements and bugfixes
  were made in Apache v1.2.x and Apache v1.3.x, then the module underwent a major
  overhaul for Apache v2.0. The protocol support was upgraded to HTTP/1.1,
  and filter support was enabled.</p>
  <p>Please note that the <strong>caching</strong> function present in
  mod_proxy up to Apache v1.3.x has been <strong>removed</strong> from
  mod_proxy and will be incorporated into a new module, mod_cache.</p>
  <section id="configs"><title>Common configuration topics</title>
  <li><a href="#forwardreverse">Forward and Reverse Proxies</a></li>
  <li><a href="#access">Controlling access to your proxy</a></li>
  <li><a href="#shortname">Using Netscape hostname shortcuts</a></li>
  <li><a href="#mimetypes">Why doesn't file type <em>xxx</em> download
via FTP?</a></li>
  <li><a href="#type">How can I force an FTP ASCII download of File <em>xxx</em>?</a></li>
  <li><a href="#percent2fhck">How can I access FTP files outside of my home directory?</a></li>
  <li><a href="#ftppass">How can I hide the FTP cleartext password in my browser's
URL line?</a></li>
  <li><a href="#startup">Why does Apache start more slowly when using the
          proxy module?</a></li>
  <!--<li><a href="#socks">Can I use the Apache proxy module with my SOCKS
  proxy?</a></li>-->
  <li><a href="#intranet">What other functions are useful for an intranet proxy
  server?</a></li>
  <section id="forwardreverse"><title>Forward and Reverse Proxies</title>
  <p>Apache can be configured in both a <em>forward</em> and <em>reverse</em>
  proxy configuration.</p>
  <p>A <em>forward proxy</em> is an intermediate system that enables a browser
to connect to a
  remote network to which it normally does not have access. A forward proxy
  can also be used to cache data, reducing load on the networks between the
  forward proxy and the remote webserver.</p>
  <p>Apache's mod_proxy can be configured to behave like a forward proxy
  using the <directive module="mod_proxy">ProxyRequests</directive>
  directive. In addition, caching of data can be achieved by configuring
  Apache <module>mod_cache</module>. Other dedicated forward proxy
  packages include <a href="">Squid</a>.</p>
  <p>A <em>reverse proxy</em> is a webserver system that is capable of serving
  webpages sourced from other webservers - in addition to webpages on disk or generated
  dynamically by CGI - making these pages look like they originated at the
  reverse proxy.</p>
  <p>When configured with the mod_cache module the reverse
  proxy can act as a cache for slower backend webservers. The reverse proxy
  can also enable advanced URL strategies and management techniques, allowing
  webpages served using different webserver systems or architectures to
  coexist inside the same URL space. Reverse proxy systems are also ideal for
  implementing centralised logging for websites with many or diverse website
  backends. Complex multi-tier webserver systems can be constructed using an
  Apache mod_proxy frontend and any number of backend webservers.</p>
  <p>The reverse proxy is configured using the
  <directive module="mod_proxy">ProxyPass</directive> and <directive
  module="mod_proxy">ProxyPassReverse</directive> directives. Caching can be
  enabled using mod_cache as with the forward proxy.</p>
  <section id="access"><title>Controlling access to your proxy</title>
  <p>You can control who can access your proxy via the normal
  <directive module="core" type="section">Directory</directive>
  control block using the following example:</p>
  &lt;Directory proxy:*&gt;<br />
  Order Deny,Allow<br />
  Deny from all<br />
  Allow from 192.168.0<br />
  &lt;/Directory&gt;
  <p>A <directive module="core" type="section">Files</directive> block
  will also work, and is the only method known to work for all possible
  URLs in Apache versions earlier than 1.2b10.</p>
  <p>When configuring a reverse proxy, access control takes on the
  attributes of the normal server <directive module="core"
  type="section">directory</directive> configuration.</p>
  <!--<h2><a name="shortname">Using Netscape hostname shortcuts</a></h2>
  There is an optional patch to the proxy module to allow Netscape-like
  hostname shortcuts to be used. It's available from the
  <a href=""
  ><code>contrib/patches/1.2</code></a> directory on the Apache Web site.-->
  <section id="mimetypes"><title>Why doesn't file type <em>xxx</em>
  download via FTP?</title>
  <p>You probably don't have that particular file type defined as
  <em>application/octet-stream</em> in your proxy's mime.types configuration
  file. A useful line can be</p>
  application/octet-stream        bin dms lha lzh exe class tgz taz
  <section id="type"><title>How can I force an FTP ASCII download of
  File <em>xxx</em>?</title>
  <p>In the rare situation where you must download a specific file using the FTP
  <strong>ASCII</strong> transfer method (while the default transfer is in
  <strong>binary</strong> mode), you can override mod_proxy's default by
  suffixing the request with <code>;type=a</code> to force an ASCII transfer.
  (FTP Directory listings are always executed in ASCII mode, however.)</p>
  <section id="percent2fhck"><title>How can I access FTP files outside
  of my home directory?</title>
  <p>An FTP URI is interpreted relative to the home directory of the user
  who is logging in. Alas, to reach higher directory levels you cannot
  use /../, as the dots are interpreted by the browser and not actually
  sent to the FTP server. To address this problem, the so-called "Squid
  %2f hack" was implemented in the Apache FTP proxy; it is a solution
  which is also used by other popular proxy servers like the <a
  href="">Squid Proxy Cache</a>.  By
  prepending /%2f to the path of your request, you can make such a proxy
  change the FTP starting directory to / (instead of the home
  directory).</p>
  <p><strong>Example:</strong> To retrieve the file
  <code>/etc/motd</code>, you would use the URL</p>
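  <p>The prefixing rule described above can be sketched in a few lines of
  Python. This is only an illustration of how such a URL is put together:
  the function name and the host <code>ftp.example.com</code> are
  hypothetical and are not part of mod_proxy.</p>

```python
def ftp_url_from_root(authority: str, absolute_path: str) -> str:
    """Build an FTP URL whose path starts at / instead of the home directory.

    authority is the user@host part (hypothetical values below);
    absolute_path is the path as seen from the FTP server's root.
    """
    if not absolute_path.startswith("/"):
        raise ValueError("absolute_path must start with '/'")
    # Prepend /%2f so the proxy changes the starting directory to /.
    return "ftp://" + authority + "/%2f" + absolute_path


print(ftp_url_from_root("user@ftp.example.com", "/etc/motd"))
```

  <p>Without the <code>/%2f</code> prefix the same path would be looked up
  relative to the login user's home directory.</p>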
  <section id="ftppass"><title>How can I hide the FTP cleartext password
  in my browser's URL line?</title>
  <p>To log in to an FTP server by username and password, Apache
  uses different strategies.
  In the absence of a user name and password in the URL altogether,
  Apache sends an anonymous login to the FTP server, i.e.,</p>
  user: anonymous<br />
  password: apache_proxy@
  <p>This works for all popular FTP servers which are configured for
  anonymous access.</p>
  <p>For a personal login with a specific username, you can embed
  the user name into the URL, like in:
  <code>ftp://<em>username@host</em>/myfile</code>. If the FTP server
  asks for a password when given this username (which it should),
  then Apache will reply with a [401 Authorization required] response,
  which causes the browser to pop up the username/password dialog.
  Upon entering the password, the connection attempt is retried,
  and if successful, the requested resource is presented.
  The advantage of this procedure is that your browser does not
  display the password in cleartext (which it would if you had used
  <code>ftp://<em>username:password@host</em>/myfile</code> in
  the first place).</p>
  The password which is transmitted in such a way
  is not encrypted on its way. It travels between your browser and
  the Apache proxy server as a base64-encoded cleartext string, and
  between the Apache proxy and the FTP server as plaintext. You should
  therefore think twice before accessing your FTP server via HTTP
  (or before accessing your personal files via FTP at all!). When
  using insecure channels, an eavesdropper might intercept your
  password on its way.
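  <p>To see why base64 encoding offers no protection, consider the
  following Python sketch. It assumes the browser sends the credentials
  using the HTTP Basic scheme; the function names and the credentials are
  made up for illustration.</p>

```python
import base64
from typing import Tuple


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode((username + ":" + password).encode("utf-8"))
    return "Basic " + token.decode("ascii")


def recover_credentials(header_value: str) -> Tuple[str, str]:
    """Reverse the encoding, exactly as an eavesdropper could."""
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic credential")
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("alice", "s3cret")  # hypothetical credentials
print(header)
print(recover_credentials(header))
```

  <p>No key is involved at any point, so anyone who can read the header
  can read the password.</p>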
  <section id="startup"><title>Why does Apache start more slowly when
  using the proxy module?</title>
  <p>If you're using the <directive module="mod_proxy">ProxyBlock</directive>
  directive, hostnames' IP addresses are looked up and cached during
  startup for later match tests. This may take a few seconds (or more)
  depending on the speed with which the hostname lookups occur.</p>
  <!--<h2><a name="socks">Can I use the Apache proxy module with my SOCKS proxy?</a></h2>
  Yes. Just build Apache with the rule <code>SOCKS4=yes</code> in your
  <em>Configuration</em> file, and follow the instructions there. SOCKS5
  capability can be added in a similar way (there's no <code>SOCKS5</code>
  rule yet), so use the <code>EXTRA_LDFLAGS</code> definition, or build Apache
  normally and run it with the <em>runsocks</em> wrapper provided with SOCKS5,
  if your OS supports dynamically linked libraries.<p>
  Some users have reported problems when using SOCKS version 4.2 on Solaris.
  The problem was solved by upgrading to SOCKS 4.3.<p>
  Remember that you'll also have to grant access to your Apache proxy machine by
  permitting connections on the appropriate ports in your SOCKS daemon's
  configuration.-->
  <section id="intranet"><title>What other functions are useful for an
  intranet proxy server?</title>
  <p>An Apache proxy server situated in an intranet needs to forward
  external requests through the company's firewall. However, when it has
  to access resources within the intranet, it can bypass the firewall
  when accessing hosts. The <directive
  module="mod_proxy">NoProxy</directive> directive is useful for
  specifying which hosts belong to the intranet and should be accessed
  directly.</p>
  <p>Users within an intranet tend to omit the local domain name from their
  WWW requests, thus requesting "http://somehost/" instead of
  "". Some commercial proxy servers let them get
  away with this and simply serve the request, implying a configured
  local domain. When the <directive module="mod_proxy">ProxyDomain</directive> directive
  is used and the server is <a href="#proxyrequests">configured for
  proxy service</a>, Apache can return a redirect response and send the client
  to the correct, fully qualified, server address. This is the preferred method
  since the user's bookmark files will then contain fully qualified hosts.</p>
  <syntax>ProxyPreserveHost on|off</syntax>
  <default>ProxyPreserveHost Off</default>
  <contextlist><context>server config</context>
  <context>virtual host</context>
  <compatibility>Available in
  Apache 2.0.31 and later.</compatibility>
  <p>When enabled, this option will pass the Host: line from the
  incoming request to the proxied host, instead of the hostname
  specified in the <directive module="mod_proxy">ProxyPass</directive> line.</p>
  <p>This option should normally be turned 'off'.</p>
  <syntax>ProxyRequests on|off</syntax>
  <default>ProxyRequests Off</default>
  <contextlist><context>server config</context>
  <context>virtual host</context>
  <p>This allows or prevents Apache from functioning as a forward proxy
  server. (Setting ProxyRequests to 'off' does not disable use of the 
  <directive module="mod_proxy">ProxyPass</directive> directive.)</p>
  <p>In a typical reverse proxy configuration, this option should be set to
  'off'.</p>
  <syntax>ProxyRemote <em>match remote-server</em></syntax>
  <contextlist><context>server config</context>
  <context>virtual host</context>
  <p>This defines remote proxies to this proxy. <em>match</em> is either the
  name of a URL-scheme that the remote server supports, or a partial URL
  for which the remote server should be used, or '*' to indicate the
  server should be contacted for all requests. <em>remote-server</em> is a
  partial URL for the remote server. Syntax:</p>
    remote-server = protocol://hostname[:port]
  <p><em>protocol</em> is the protocol that should be used to communicate
  with the remote server; only "http" is supported by this module.</p>
    ProxyRemote<br />
    ProxyRemote *<br />
    ProxyRemote ftp
  <p>In the last example, the proxy will forward FTP requests, encapsulated
  as yet another HTTP proxy request, to another proxy which can handle
  them.</p>
  <p>This option also supports reverse proxy configuration - a backend
  webserver can be embedded within a virtualhost URL space even if that
  server is hidden by another forward proxy.</p>
  <syntax>ProxyPass [<em>path</em>] !|<em>url</em></syntax>
  <contextlist><context>server config</context>
  <context>virtual host</context>
  <!-- XXX: Need to document that the path is not used when placed in
  a location section -->
  <p>This directive allows remote servers to be mapped into the space of
  the local server; the local server does not act as a proxy in the
  conventional sense, but appears to be a mirror of the remote
  server. <em>path</em> is the name of a local virtual path;
  <em>url</em> is a partial URL for the remote server.</p>
  <p>Suppose the local server has address <code></code>;

     ProxyPass /mirror/foo/
  <p>will cause a local request for the
  &lt;<code></code>&gt; to be
  internally converted into a proxy request to
  The <code>!</code> argument is useful in situations where you don't want to reverse-proxy
  a subdirectory, e.g.</p>
          ProxyPass /mirror/foo/i !<br />
          ProxyPass /mirror/foo
  <p>will proxy all requests to /mirror/foo to EXCEPT requests made to /mirror/foo/i</p>
  <note>NB: Order is important. You need to put the exclusions BEFORE the general ProxyPass
  directive.</note>
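  <p>Why the order matters can be illustrated with a small Python model of
  prefix mapping in which the first matching rule wins. This is a
  simplified sketch, not the module's code: the rule table, the function
  name and the backend host <code>backend.example.com</code> are
  assumptions made for the example.</p>

```python
from typing import List, Optional, Tuple

# (local path prefix, target) pairs in configuration order.
# "!" marks a prefix that must not be proxied.
# backend.example.com is a hypothetical backend host.
RULES: List[Tuple[str, str]] = [
    ("/mirror/foo/i", "!"),
    ("/mirror/foo", "http://backend.example.com"),
]


def map_request(path: str, rules: List[Tuple[str, str]]) -> Optional[str]:
    """Return the backend URL for path, or None if it is served locally."""
    for prefix, target in rules:
        if path.startswith(prefix):
            if target == "!":
                return None  # excluded: handled by the local server
            return target + path[len(prefix):]
    return None  # no rule matched


print(map_request("/mirror/foo/bar.html", RULES))
print(map_request("/mirror/foo/i/logo.gif", RULES))
# With the general rule first, the exclusion is never reached:
print(map_request("/mirror/foo/i/logo.gif", list(reversed(RULES))))
```

  <p>In the last call the general rule is listed first, so the request under
  <code>/mirror/foo/i</code> is proxied even though an exclusion exists.</p>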
  <syntax>ProxyPassReverse [<em>path</em>] <em>url</em></syntax>
  <contextlist><context>server config</context>
  <context>virtual host</context>
  <!-- XXX: Need to document that the path is not used when placed in
  a location section -->
  <p>This directive lets Apache adjust the URL in the <code>Location</code>,
  <code>Content-Location</code> and <code>URI</code> headers on
  HTTP redirect responses. This is essential when Apache is used as
  a reverse proxy to avoid by-passing the reverse proxy because of HTTP
  redirects on the backend servers which stay behind the reverse proxy.</p>
  <p><em>path</em> is the name of a local virtual path.<br />
  <em>url</em> is a partial URL for the remote server - the same way they are
  used for the <directive module="mod_proxy">ProxyPass</directive> directive.</p>
  <p>Example:<br />
  Suppose the local server has address <code></code>; then</p>
     ProxyPass         /mirror/foo/<br />
     ProxyPassReverse  /mirror/foo/
  <p>will not only cause a local request for the
  &lt;<code></code>&gt; to be internally
  converted into a proxy request to &lt;<code></code>&gt; (the
  functionality <code>ProxyPass</code> provides here). It also takes care of
  redirects the server sends: when <code></code> is
  redirected by it to <code></code> Apache adjusts this to
  <code></code> before forwarding the HTTP
  redirect response to the client. </p>
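  <p>The header adjustment described above amounts to swapping a URL prefix.
  The following Python sketch shows the idea for a single
  <code>Location</code> value; the function name and both host names are
  hypothetical, and the real module handles more cases than this.</p>

```python
def rewrite_location(location: str, backend_url: str, local_url: str) -> str:
    """Point a backend redirect target back at the reverse proxy."""
    if location.startswith(backend_url):
        return local_url + location[len(backend_url):]
    return location  # redirects to unrelated sites are left alone


BACKEND = "http://backend.example.com/"          # hypothetical backend
LOCAL = "http://proxy.example.com/mirror/foo/"   # hypothetical proxy URL space

print(rewrite_location("http://backend.example.com/quux/", BACKEND, LOCAL))
print(rewrite_location("http://elsewhere.example.org/", BACKEND, LOCAL))
```

  <p>Without this step the client would follow the redirect straight to the
  backend and bypass the proxy.</p>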
  <p>Note that this <directive>ProxyPassReverse</directive> directive can
  also be used in conjunction with the proxy pass-through feature
  ("<code>RewriteRule ...  [P]</code>") from
  <module>mod_rewrite</module> because it doesn't depend on a
  corresponding <directive module="mod_proxy">ProxyPass</directive>
  directive.</p>
  <syntax>AllowCONNECT <em>port</em> [<em>port</em>] ...</syntax>
  <default>AllowCONNECT 443 563</default>
  <contextlist><context>server config</context>
  <context>virtual host</context>
  <p>The <directive>AllowCONNECT</directive> directive specifies a list
  of port numbers to which the proxy <code>CONNECT</code> method may
  connect.  Today's browsers use this method when an <em>https</em>
  connection is requested and proxy tunneling over <em>http</em> is in
  effect.<br /> By default, only the default https port (443) and the
  default snews port (563) are enabled. Use the
  <directive>AllowCONNECT</directive> directive to override this default and
  allow connections to the listed ports only.</p>
  <syntax>ProxyBlock *|<em>word|host|domain</em>
  [<em>word|host|domain</em>] ...</syntax>
  <contextlist><context>server config</context>
  <context>virtual host</context>
  <p>The <directive>ProxyBlock</directive> directive specifies a list of
  words, hosts and/or domains, separated by spaces.  HTTP, HTTPS, and
  FTP document requests to sites whose names contain matched words,
  hosts or domains are <em>blocked</em> by the proxy server. The proxy
  module will also attempt to determine IP addresses of list items which
  may be hostnames during startup, and cache them for match test as
  well. Example:</p>
  <p>'' would also be matched if referenced by IP
  address.</p>
  <p>Note that 'wotsamattau' would also be sufficient to match
  <p>Note also that</p>
  ProxyBlock *
  <p>blocks connections to all sites.</p>
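  <p>The name-matching part of this behaviour can be modelled as a substring
  test. The Python sketch below is an approximation under stated
  assumptions: the function name is invented, matching is assumed to be
  case-insensitive, and the IP address lookup performed at startup is left
  out.</p>

```python
from typing import Iterable


def is_blocked(hostname: str, blocklist: Iterable[str]) -> bool:
    """Return True if hostname contains any listed word, host or domain."""
    host = hostname.lower()
    for item in blocklist:
        # "*" blocks every site; otherwise a substring match is enough.
        if item == "*" or item.lower() in host:
            return True
    return False


print(is_blocked("www.example.com", ["example"]))
print(is_blocked("www.example.com", ["other.test"]))
print(is_blocked("www.example.com", ["*"]))
```

  <p>Because any substring matches, a short word can block more sites than
  intended.</p>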
  <syntax>ProxyReceiveBufferSize <em>bytes</em></syntax>
  <contextlist><context>server config</context>
  <context>virtual host</context>
  <p>The <directive>ProxyReceiveBufferSize</directive> directive
  specifies an explicit network buffer size for outgoing HTTP and FTP
  connections, for increased throughput.  It has to be greater than 512
  or set to 0 to indicate that the system's default buffer size should
  be used.</p>
    ProxyReceiveBufferSize 2048
  <syntax>ProxyMaxForwards <em>number</em></syntax>
  <default>ProxyMaxForwards 10</default>
  <contextlist><context>server config</context>
  <context>virtual host</context>
  <compatibility>Available in Apache 2.0 and later</compatibility>
  <p>The <directive>ProxyMaxForwards</directive> directive specifies the
  maximum number of proxies through which a request may pass. This is
  set to prevent infinite proxy loops, or a DoS attack.</p>
    ProxyMaxForwards 10
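  <syntax>NoProxy <em>Domain</em>|<em>SubNet</em>|<em>IPAddr</em>|<em>Hostname</em>
   [<em>Domain</em>|<em>SubNet</em>|<em>IPAddr</em>|<em>Hostname</em>] ...</syntax>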
  <contextlist><context>server config</context>
  <context>virtual host</context>
  <p>This directive is only useful for Apache proxy servers within
  intranets.  The <directive>NoProxy</directive> directive specifies a
  list of subnets, IP addresses, hosts and/or domains, separated by
  spaces. A request to a host which matches one or more of these is
  always served directly, without forwarding to the configured
  <directive module="mod_proxy">ProxyRemote</directive> proxy server(s).</p>
    ProxyRemote  *<br />
  <p>Each argument to the NoProxy directive is one of the following types:</p>
      <!-- ===================== Domain ======================= -->
      <dt><a name="domain">Domain</a></dt>
      <dd>A <em>Domain</em> is a partially qualified DNS domain name, preceded
          by a period.
          It represents a list of hosts which logically belong to the same DNS
          domain or zone (<em>i.e.</em>, the suffixes of the hostnames are all
          ending in <em>Domain</em>).<br />
  		Examples: <code>.com</code>   <code></code><br />
          To distinguish <em>Domain</em>s from <a href="#hostname"><em>Hostname</em></a>s
          (both syntactically and semantically; a DNS domain can have a DNS A record,
          too!), <em>Domain</em>s are always written
          with a leading period.<br />
          Note: Domain name comparisons are done without regard to the case,
          and <em>Domain</em>s are always assumed to be anchored in the root 
          of the DNS tree, therefore two domains <code></code> and
          <code></code> (note the trailing period) are
          considered equal. Since a domain comparison does not involve a DNS
  	lookup, it is much more efficient than subnet comparison.</dd>
      <!-- ===================== SubNet ======================= -->
      <dt><a name="subnet">SubNet</a></dt>
      <dd>A <em>SubNet</em> is a partially qualified internet address in
          numeric (dotted quad) form, optionally followed by a slash and the
          netmask, specified as the number of significant bits in the
          <em>SubNet</em>. It is used to represent a subnet of hosts which can
          be reached over a common network interface. In the absence of the
          explicit net mask it is assumed that omitted (or zero valued)
          trailing digits specify the mask. (In this case, the netmask can
          only be multiples of 8 bits wide.)<br />
           <dt><code>192.168</code> or <code></code></dt>
           <dd>the subnet with an implied netmask of 16 valid bits
               (sometimes used in the netmask form <code></code>)</dd>
           <dd>the subnet <code></code> with a netmask of
               valid bits (also used in the form</dd>
  		As a degenerate case, a <em>SubNet</em> with 32 valid bits is
          equivalent to an <em>IPAddr</em>, while a <em>SubNet</em> with zero
          valid bits (<em>e.g.</em>, is the same as the constant
          <em>_Default_</em>, matching any IP address. </dd>
      <!-- ===================== IPAddr ======================= -->
      <dt><a name="ipaddr">IPAddr</a></dt>
      <dd>An <em>IPAddr</em> represents a fully qualified internet address in
          numeric (dotted quad) form. Usually, this address represents a
          host, but there need not necessarily be a DNS domain name
          connected with the address.<br />
  		Example:<br />
          Note: An <em>IPAddr</em> does not need to be resolved by the DNS
  	system, so it can result in more efficient Apache performance.</dd>
      <!-- ===================== Hostname ======================= -->
      <dt><a name="hostname">Hostname</a></dt>
      <dd>A <em>Hostname</em> is a fully qualified DNS domain name which can
          be resolved to one or more <a
  	href="#ipaddr"><em>IPAddrs</em></a> via the DNS domain name service.
          It represents a logical host (in contrast to
  	<a href="#domain"><em>Domain</em></a>s, see
          above) and must be resolvable to at least one <a
  	href="#ipaddr"><em>IPAddr</em></a> (or often to a list of hosts
  	with different <a href="#ipaddr"><em>IPAddr</em></a>s).<br />
  		Examples: <code></code>
                    <code></code><br />
          Note: In many situations, it is more efficient to specify an
          <a href="#ipaddr"><em>IPAddr</em></a> in place of a
  	<em>Hostname</em> since a DNS lookup
          can be avoided. Name resolution in Apache can take a considerable amount
          of time when the connection to the name server uses a slow PPP
          link.<br />
          Note: <em>Hostname</em> comparisons are done without regard to the case,
          and <em>Hostname</em>s are always assumed to be anchored in the root
          of the DNS tree, therefore two hosts <code></code>
          and <code></code> (note the trailing period) are
          considered equal.</dd>
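  <p>The comparison rules for <em>SubNet</em> and <em>Domain</em> arguments
  can be made concrete with a short Python sketch. It is an illustration
  of the rules as stated above, not the module's implementation; all
  function names and the <code>example.com</code> names are hypothetical.</p>

```python
import ipaddress


def parse_subnet(spec: str) -> ipaddress.IPv4Network:
    """Turn a SubNet such as '192.168' or '192.168.112.0/21' into a network."""
    addr, slash, bits = spec.partition("/")
    octets = [o for o in addr.split(".") if o != ""]
    if slash:
        prefix_len = int(bits)
    else:
        # Omitted or zero-valued trailing octets define the implied mask,
        # so it is always a multiple of 8 bits.
        significant = list(octets)
        while significant and int(significant[-1]) == 0:
            significant.pop()
        prefix_len = 8 * len(significant)
    octets += ["0"] * (4 - len(octets))
    dotted = ".".join(octets)
    return ipaddress.ip_network(dotted + "/" + str(prefix_len), strict=False)


def subnet_contains(spec: str, ip: str) -> bool:
    """Check whether a dotted-quad address falls inside the SubNet."""
    return ipaddress.ip_address(ip) in parse_subnet(spec)


def normalize(name: str) -> str:
    """Ignore case and a trailing period, as the notes above describe."""
    return name.lower().rstrip(".")


def host_in_domain(hostname: str, domain: str) -> bool:
    """Check whether hostname ends in Domain (written with a leading period)."""
    return normalize(hostname).endswith(normalize(domain))


print(parse_subnet("192.168"))
print(subnet_contains("192.168.112.0/21", "192.168.119.255"))
print(host_in_domain("WWW.Example.COM.", ".example.com"))
```

  <p>The domain test is a plain string comparison, while a
  <em>Hostname</em> argument would require a DNS lookup before any
  comparison could take place.</p>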
  <seealso><a href="../dns-caveats.html">DNS Issues</a></seealso>
  <syntax>ProxyTimeout <em>seconds</em></syntax>
  <default>ProxyTimeout 300</default>
  <contextlist><context>server config</context>
  <context>virtual host</context>
  <compatibility>Available in
  Apache 2.0.31 and later</compatibility>
  <p>This directive allows a user to specify a timeout on proxy requests.
  This is useful when you have a slow/buggy appserver which hangs,
  and you would rather just return a timeout and fail gracefully instead
  of waiting however long it takes the server to return.</p>
  <syntax>ProxyDomain <em>Domain</em></syntax>
  <contextlist><context>server config</context>
  <context>virtual host</context>
  <p>This directive is only useful for Apache proxy servers within
  intranets.  The <directive>ProxyDomain</directive> directive specifies
  the default domain which the Apache proxy server will belong to. If a
  request to a host without a domain name is encountered, a redirection
  response to the same host with the configured <em>Domain</em> appended
  will be generated.</p>
    ProxyRemote  *<br />
    NoProxy<br />
  <syntax>ProxyVia on|off|full|block</syntax>
  <default>ProxyVia off</default>
  <contextlist><context>server config</context>
  <context>virtual host</context>
  <p>This directive controls the use of the <code>Via:</code> HTTP
  header by the proxy. Its intended use is to control the flow of
  proxy requests along a chain of proxy servers.  See RFC2068 (HTTP/1.1)
  for an explanation of <code>Via:</code> header lines.</p>
  <ul>
  <li>If set to <em>off</em>, which is the default, no special processing is
  performed. If a request or reply contains a <code>Via:</code> header,
  it is passed through unchanged.</li>
  <li>If set to <em>on</em>, each
  request and reply will get a <code>Via:</code> header line added for
  the current host.</li>
  <li>If set to <em>full</em>, each generated <code>Via:</code>
  line will additionally have the Apache server version shown as a
  <code>Via:</code> comment field.</li>
  <li>If set to <em>block</em>, every
  proxy request will have all its <code>Via:</code> header lines
  removed. No new <code>Via:</code> header will be generated.</li>
  </ul>
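  <p>The four settings can be summarised as a small Python function operating
  on a list of header pairs. This is a sketch of the behaviour listed
  above only: the function name, the host name and the version string are
  assumptions, and "1.1" stands for the protocol version of the received
  message.</p>

```python
from typing import List, Tuple

Header = Tuple[str, str]


def apply_proxy_via(headers: List[Header], mode: str, host: str,
                    server_version: str = "") -> List[Header]:
    """Return the headers after applying a ProxyVia-style policy."""
    if mode == "off":
        return list(headers)  # existing Via headers pass through unchanged
    if mode == "block":
        # Remove every Via header and generate no new one.
        return [(n, v) for n, v in headers if n.lower() != "via"]
    if mode not in ("on", "full"):
        raise ValueError("unknown mode: " + mode)
    entry = "1.1 " + host
    if mode == "full":
        entry += " (" + server_version + ")"  # version as a comment field
    return list(headers) + [("Via", entry)]


incoming = [("Host", "www.example.com"), ("Via", "1.0 upstream.example.net")]
for mode in ("off", "on", "full", "block"):
    print(mode, apply_proxy_via(incoming, mode, "proxy.example.com", "Apache/2.0"))
```

  <p>Only <em>block</em> discards information that earlier proxies in the
  chain added.</p>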
  <syntax>ProxyErrorOverride On|Off</syntax>
  <default>ProxyErrorOverride Off</default>
  <contextlist><context>server config</context>
  <context>virtual host</context>
  <compatibility>Available in version 2.0 and later</compatibility>
  <p>This directive is useful for reverse-proxy setups, where you want to 
  have a common look and feel on the error pages seen by the end user. 
  This also allows for included files (via mod_include's SSI) to get
  the error code and act accordingly (the default behavior would display
  the error page of the proxied server; turning this on shows the SSI
  error message).</p>
