httpd-dev mailing list archives

From Dean Gaudet <dgau...@arctic.org>
Subject Re: absoluteURIs suck
Date Sat, 21 Feb 1998 10:16:29 GMT
rant rant.

On Fri, 20 Feb 1998, Roy T. Fielding wrote:

> >I really don't understand the lameness regarding absoluteURIs in HTTP/1.1. 
> >Suppose HTTP/1.2 comes out and dictates that absoluteURIs must be used for
> >all requests (this is hinted at in RFC2068).  In order to interoperate
> >with HTTP/1.1 servers all HTTP/1.2 clients will have to also include Host:
> >headers.  This is a waste of bandwidth having to include the hostname
> >twice. 
> 
> Such a concern was less significant than the probability that you might
> be sending the absoluteURI request to an old proxy (that doesn't send Host
> but will forward it if present) and the possibility that name-based
> vhosts would fail due to inadequate implementation.  Much of the weird
> wording in the spec is due to several IESG biggies "opinion" of what
> was sufficient to ensure implementation versus my own pleading to
> ensure deployment.

And folks wonder why the standards processes are sneered
at when they refuse to listen to the folks implementing
the protocols.  But I suppose if I'm going to complain about
bandwidth I've got a lot more issues with HTTP than just this.
See <ftp://koobera.math.uic.edu/www/sarcasm/modest-proposal.txt>.

> >Apache 1.2 and 1.3 are broken as far as forward compatibility with this
> >hypothetical HTTP spec as well.  Consider: 
> >
> ><VirtualHost 10.1.1.1>
> >...
> ></VirtualHost>
> >
> >No NameVirtualHost in the config.  I consider the only correct way to
> >implement this config is that *all requests* appearing at 10.1.1.1:80 will
> >be served by that virtual host.  Right now if a request appears there with
> >an absolute URI with a hostname that isn't listed *we will reject it*.
> >This means we're not forward compatible with some lame HTTP version that
> >doesn't exist but is threatened to exist. 
> >
> >Contrast this with the behaviour on a Host: header that we don't
> >recognize... we just don't care about it, we serve what has been
> >configured (a default server, or the ip-vhost).
> 
> Either the server is configured to deal with the global namespace, or
> it ignores the global namespace assuming that every request it receives
> is intended for it.

This is not a requirement of RFC2068.  And it's not a requirement of
Apache either.  Section 1.3:

    ...any server may act as an origin server, proxy, gateway, or tunnel,
    switching behavior based on the nature of each request.

> The requirements in HTTP/1.1 force the server to
> recognize its own namespace as part of the global absoluteURI namespace,
> thereby giving us "training wheels" for the day in which we always use
> the global namespace.  In contrast, the deployment strategy of Host is
> not moving toward the global namespace, and thus whether or not we actually
> check the Host value for non-vhosts is a decision left to the implementers.

I'm happy not checking Host for ip-vhosts, in fact that's what's
implemented now.  Except we use Host for self-redirects when
UseCanonicalName is off.

My complaint is absoluteURI.  Let me examine the possibilities for
recognizing a domain name as your own server (so that you can
determine if it's a proxy request or an origin request):

- DNS lookup, trust the IP returned, and if it matches the local
    connection IP then say "that's me".  This is a seriously broken
    method -- not only for performance, but for denial of service
    and security reasons.  Apache currently DOES implement this,
    and I want to get rid of it.  Doing a forward lookup is bad
    because the attackers control forward lookups.

- Exhaustive listing of servernames.  Apache tries these lists first
    before doing the DNS lookup.  But it is impossible to exhaustively
    list servernames precisely because it is the client that decides
    how a hostname maps to an ip address, not the server.
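
As a rough illustration of the second approach (a configured list of
names, no DNS lookup), here is a minimal Python sketch.  It is not
Apache's check_fulluri; the function name, the CONFIGURED_NAMES set and
the example.com names are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical configuration: the names this server was told to answer to.
CONFIGURED_NAMES = {"www.example.com", "example.com", "10.1.1.1"}


def classify_request(request_target, configured_names=CONFIGURED_NAMES):
    """Return 'origin' or 'proxy' for a request target, using no DNS lookup."""
    if not request_target.startswith(("http://", "https://")):
        # abs_path form such as "/index.html" is treated as an origin request.
        return "origin"
    host = urlsplit(request_target).hostname or ""
    # Compare case-insensitively and ignore a trailing root dot.
    host = host.rstrip(".").lower()
    return "origin" if host in configured_names else "proxy"
```

Under these assumptions a request for http://www/ is classified as a
proxy request even when the client resolved www to this very server,
which is the gap described in the second bullet.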

Insert my rant here about how name-vhosts are an inaccurate protocol.
I'd love something like this inserted into the standard:

    Note that the server may resolve DNS in a different manner than
    the client.  For example, a client mdma.chem.happy.edu requesting
    the unqualified hostname www may resolve it to www.chem.happy.edu.
    If the server happens to be managed by the CS dept, it may resolve
    an unqualified www as www.cs.happy.edu.  A client making a request
    to www may then not reach the correct origin server.  To avoid
    problems such as this clients SHOULD fully qualify all domain names
    specified by the user.

It's completely trivial to do this with the standard unix gethostbyname()
API... the h_name field of struct hostent is fully qualified on any
reasonable system (i.e. not sunos4 or 5 using NIS with poor host maps).
(of course Sun managed to make their servers ubiquitous in academia,
which not only means that academics always say apache has bad performance,
but more relevantly means that the clients really have no idea what the
heck the FQDN is ... 'cause NIS sucks.)
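
A minimal sketch of that client-side qualification, written in Python
rather than C.  socket.gethostbyname_ex returns the resolver's primary
name first, which is assumed here to correspond to h_name; the function
name fully_qualify and the injectable resolve parameter are
hypothetical, added only so the logic can be exercised without a
network.

```python
import socket


def fully_qualify(name, resolve=socket.gethostbyname_ex):
    """Return the resolver's canonical name for `name`, or `name` unchanged."""
    try:
        canonical, _aliases, _addresses = resolve(name)
    except OSError:
        # Lookup failed (gaierror and herror are OSError subclasses).
        return name
    # Only use the result if it looks qualified; a resolver backed by
    # poor host maps may hand back the short name again.
    return canonical if "." in canonical else name
```

A client would call fully_qualify on the user-supplied hostname before
building the request line or Host: header, and fall back to the name as
typed when the resolver cannot do better.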

Not that it matters now.

At any rate, fixing check_fulluri in Apache is going to require an
extra field in the request_rec.  When running check_fulluri, Apache has
no idea whether it's handling an origin request or a proxy request.

rant rant.

Dean

P.S. rant rant:  Section 5.1.2 says:

    In order to avoid request loops, a proxy MUST be able to recognize
    all of its server names, including any aliases, local variations,
    and the numeric IP address.

That doesn't avoid request loops.  That avoids self-loops, which
would be a reasonable requirement of any quality implementation.
A proxy has to do DNS lookups anyhow, so it doesn't bother me that
this requirement absolutely requires DNS to be implemented properly.
But it certainly doesn't avoid loops.  Nothing in the standard enforces
proxy loop avoidance.  Max-Forwards can't do it either, section 14.31:

    The Max-Forwards header field SHOULD be ignored for all other methods
    defined by this specification and for any extension methods for which
    it is not explicitly referred to as part of that method definition.

Thank god a proxy can freely insert a header:

X-Loop: a.b.c.d

and detect loops.
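
A minimal sketch of the X-Loop idea in Python, assuming headers are
held as a list of (name, value) pairs and each proxy tags requests with
its own address.  The function name forward_headers and that
representation are hypothetical; only the X-Loop header itself comes
from the message above.

```python
def forward_headers(headers, my_id):
    """Return the headers to forward, or None if a loop is detected.

    `headers` is a list of (name, value) pairs from the incoming request;
    `my_id` is the tag this proxy adds, for example its IP address.
    """
    for name, value in headers:
        if name.lower() == "x-loop" and value.strip() == my_id:
            # Our own tag came back to us: the request has looped.
            return None
    # First visit: pass everything along and add our own tag.
    return headers + [("X-Loop", my_id)]
```

Because every proxy on the path appends its own tag, this catches a
loop through several proxies as well as a self-loop, as long as the
intermediate hops forward headers they do not recognize.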

