httpd-users mailing list archives

From solprovi...@apache.org
Subject Re: [users@httpd] different kinds of proxies
Date Sun, 20 Jul 2008 03:29:23 GMT
On 7/19/08, André Warnier <aw@ice-sa.com> wrote:
>  From a recent thread originally dedicated to find out if a proxy server can
> be really "transparent", I'll first quote a summary from "solprovider".
>
>  quote
>
>  I think the confusion is between a network proxy server and a Web
>  "reverse" proxy server.
>
>  A network proxy server handles NAT (Network Address Translation).  A
>  company internally uses private IP addresses (e.g. 10.*.*.*).  All
>  Internet traffic from these internal addresses uses a network proxy
>  server to reach the Internet.  The proxy server changes the
>  originating IP Addresses on the outbound packets from the internal
>  network IP address to the proxy's Internet IP address.  Responses from
>  the Internet server are received by the proxy server and changed again
>  to be sent to the originating computer on the internal network.  The
>  browser uses the Internet domain name so Cookies are not affected.
>
>  A Web "reverse" proxy server handles multiple software applications
>  appearing as a single server.  The applications can be found on
>  multiple ports on one server or on multiple hardware servers.  Visitor
>  traffic to several applications goes to one IP Address.  The Web
>  server at that IP Address decides where the request should be sent
>  distinguishing based on the server name (using Virtual Servers) or the
>  path (using Rewrites).  If the applications use Cookies, the
>  application Cookies must be rewritten by the Web proxy server because
>  the browsers use the server name of the Web proxy server, not the
>  application servers.
>  1. The browser requests http://myapp.example.com.
>  2. The Web proxy server myapp.example.com sends the request to
>  myInternalApplicationServer.example.org.
>  3. The myInternalApplicationServer.example.org sends a response with a
>  Cookie for myInternalApplicationServer.example.org to the Web proxy
>  server.
>  4. The Web proxy server changes the Cookie from
>  myInternalApplicationServer.example.org to myapp.example.com.
>  5. The browser receives the Cookie for myapp.example.com and sends the
>  Cookie with future requests to the Web proxy server.
>  6. The Web proxy server sends the incoming Cookies with the request to
>  the application server as in #2.  (Depending on security, the incoming
>  Cookies may need to be changed to match the receiving server.)
>  7. GOTO #3.
>
>  Deciding the type of proxy server being used may be confusing.  An
>  Internet request for an internal server can be handled with either
>  type depending on the gateway server.
>  - Network proxy: The gateway uses firewall software for NAT -- all
>  requests for the internal server are sent to the internal server.  The
>  internal server sends Cookies using its Internet name.
>  - Web proxy: The gateway is a Web server.  Internal application
>  servers do not use Internet names so the gateway must translate URLs
>  and Cookies.
>
>  --
>  The specification in the OP was how to Web proxy requests:
>  1. Server receives request for http://www.example.com/amazon/...
>  2. Server passes request to http://www.amazon.com/...
>  3. Server translates response from amazon so the visitor receives
>  Cookies from .example.com.
>  4. Future requests are translated so the Web proxy server
>  (www.example.com) sends the requests including Cookies to amazon.com.
>
>  Read http://httpd.apache.org/docs/2.0/mod/mod_proxy.html
>  Read the sections applying to "reverse" proxies.  Ignore "forward"
>  proxying because that process is not transparent -- the client
>  computer must be configured to use a forward proxy.
>
>  I once had difficulty with ProxyPass and switched to using Rewrites so
>  I would handle this with something like:
>         RewriteEngine On
>         RewriteRule ^/amazon/(.*)$ http://www.amazon.com/$1 [P]
>         ProxyPassReverseCookieDomain amazon.com example.com
>         ProxyPassReverse /amazon/       http://www.amazon.com/
>  This should handle Cookies and handle removing/adding "/amazon" in the
>  path.
>
>  We have not discussed changing links in pages from amazon.com to use
>  example.com.  This simple often-needed functionality has been ignored
>  by the Apache httpd project.  (This functionality was included in a
>  servlet I wrote in 1999.) Research "mod_proxy_html".
>
>  unquote
>
>  Now, I believe that there is still a third type of proxy, as follows :
>
>  When I configure my browser to use "ourproxy.ourdomain.com:8000" as the
> HTTP proxy for my browser, it means that independently of whatever NAT may
> be effected by an internal router that connects my internal network to the
> internet, something else is going on :
>  Whenever I type in my browser a URL like "http://www.amazon.com", my
> browser will not resolve "www.amazon.com" and send it a request like :
>  GET / HTTP/1.1
>  Host: www.amazon.com
>
>  Instead, my browser will send a request to "ourproxy.ourdomain.com:8000",
> as follows :
>  GET http://www.amazon.com HTTP/1.1
>  Host: www.amazon.com
>  ...
>
>  The server at "ourproxy.ourdomain.com:8000" will then look in its page
> cache to see if it already has this page from a previous access. Then it
> will either return this cached page, or retrieve the page anew from
> "www.amazon.com", cache it (maybe) and deliver the newly fetched page. (I am
> skipping a lot of details about freshness, no-cache etc.)
>
>  The main (original) question was however : what happens in this case to
> cookies possibly set by "www.amazon.com" ?
>
>  I personally imagine that such a proxy server (which I guess is the
> "forward" kind) caches only page contents, not the HTTP headers returned
> with each page, or am I wrong ?
>
>  And in any case, if a page was returned from "www.amazon.com" along with a
> "Set-Cookie" HTTP header, it should not be cache-able by the proxy server,
> or am I wrong again ?
>
>  And, if such a proxy retrieves a new page from an external server, and the
> page comes back with a "Set-Cookie" header, this cookie header is then
> passed unchanged to the original browser requester, isn't it ?
>
>  And the requesting browser should accept this cookie as originating from
> "www.amazon.com", even if technically this answer comes back from the proxy
> server, no ?
>  André

Yes, the third type of proxy exists and was mentioned in my quote in
your post.  Quoting myself,
   "Ignore "forward" proxying because that process is not transparent
-- the client computer must be configured to use a forward proxy."
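
The two request forms André describes can be sketched in a few lines of
Python.  This is only an illustration, not how any particular browser is
implemented; the function name request_head is made up for this example.
A client talking directly to the origin server sends just the path, while
a client configured to use a forward proxy sends the full URL to the proxy.

```python
from urllib.parse import urlsplit


def request_head(url: str, via_proxy: bool) -> str:
    """Build the request line and Host header for a GET of `url`.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    path = parts.path or "/"          # an empty path is sent as "/"
    if parts.query:
        path += "?" + parts.query
    if via_proxy:
        # Client configured with a forward proxy: full URL in the request line.
        target = f"{parts.scheme}://{parts.netloc}{path}"
    else:
        # Direct connection to the origin server: path only.
        target = path
    return f"GET {target} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


if __name__ == "__main__":
    print(request_head("http://www.amazon.com", via_proxy=False))
    print(request_head("http://www.amazon.com", via_proxy=True))
```

In both cases the Host header names www.amazon.com; only the request
line and the machine the connection is opened to differ.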

I should mention that a forward proxy is considered extremely
dangerous. Anybody using the forward proxy is hiding their identity
from the destination websites.  An "anonymizer" (or "anonymous proxy")
is very useful when maliciously attacking Internet servers.  Every
page in Apache httpd's documentation about configuring a forward proxy
includes the warning:
"Do not enable proxying with ProxyRequests until you have secured your
server. Open proxy servers are dangerous both to your network and to
the Internet at large."

A forward proxy can be configured to cache information including
webpages and Cookies.  Assume someone is storing and reading your
information if you do not control the proxy server.

Cookie rewriting is not an issue.  The browser and forward proxy
server work together to handle everything.  I have not tested whether
the browser stores the Cookies as from the proxy or the originating
server (as I am unwilling to enable forward proxying on a production
server and do not currently have a test environment with multiple
networks), but it works.
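
As a rough sketch of why rewriting matters for a reverse proxy but not
for a forward proxy: the domain-matching rule later written down in
RFC 6265 (section 5.1.3) compares the Cookie's Domain against the host
in the URL the browser requested.  The helper names below are invented
for this example, and the code assumes that rule rather than the
behaviour of any specific browser.

```python
import ipaddress


def is_ip(host: str) -> bool:
    """Return True if `host` is an IP address literal rather than a name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def domain_match(request_host: str, cookie_domain: str) -> bool:
    """Approximate the RFC 6265 domain-match rule (hypothetical helper)."""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")   # a leading dot is ignored
    if host == domain:
        return True
    # Suffix match only on a label boundary, and never for IP addresses.
    return host.endswith("." + domain) and not is_ip(host)


if __name__ == "__main__":
    # Forward proxy: the browser requested www.amazon.com, so an
    # amazon.com Cookie matches without any rewriting.
    print(domain_match("www.amazon.com", "amazon.com"))
    # Reverse proxy: the browser requested www.example.com, so an
    # amazon.com Cookie does not match until its Domain is rewritten.
    print(domain_match("www.example.com", "amazon.com"))
    print(domain_match("www.example.com", "example.com"))
```

Under that assumption, the Domain rewrite done by
ProxyPassReverseCookieDomain is needed only in the reverse-proxy case,
where the name the browser requested differs from the name the
application server put in the Cookie.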

Assume Cookies are in the cache because you have no guarantee that
anything is safe unless you control the proxy.  The owner of the
anonymizer can easily hijack your session.  Congratulations on
becoming Amazon's best customer with that multi-million dollar
purchase.

I am uncertain that forward proxies have any legitimate use today.
Network routing with NAT handles internal networks.  Reverse proxy
servers handle multiple backend applications appearing as one
front-end server.  Anonymizers protect identities from target servers
while increasing the risk of your identity being stolen.  I am interested
in learning when using a forward proxy is the best solution for a
legal activity.

solprovider