httpd-dev mailing list archives

From Alexei Kosut <ako...@nueva.pvt.k12.ca.us>
Subject Re: this host crap
Date Thu, 25 Apr 1996 00:31:44 GMT
On Wed, 24 Apr 1996, Brian Behlendorf wrote:

> HYPERREAL.COM
> WWW.APACHE.ORG
>   (okay, not that bizarre, but a reminder that we need to account for 
>   capitalization)

We do.

> bong.com:80
> hyperreal.com:80
> taz.hyperreal.com:80
>   (a reminder that we need to deal with port #'s - I'd prefer to in this 

We do.

>   case sent back a 302 Location: (or is it 303?) removing the redundant 
>   :80 URL)

Bad idea. According to the HTTP/1.1 spec, the Host header is not
entirely related to the URL the client typed in. Namely, it's
host:port, with port defaulting to 80. So a client would be perfectly
within its rights sending "Host: www.apache.org:80", even if the user
typed in or was linked to "http://www.apache.org/".
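
As a rough illustration of that point (a sketch only, not Apache's code;
the helper name split_host_header is made up here, and it assumes a
well-formed value with a plain host name, no IPv6 literal and no empty
port), a Host value can be read as host:port with the port defaulting
to 80, so both spellings name the same server:

```python
DEFAULT_HTTP_PORT = 80  # the default port described above


def split_host_header(value):
    """Split a Host header value into (hostname, port).

    Hypothetical helper for illustration. Assumes a well-formed value:
    a plain host name, optionally followed by ":" and a decimal port.
    """
    host, sep, port = value.strip().partition(":")
    if not sep:
        # No port given, so fall back to the default.
        return host, DEFAULT_HTTP_PORT
    return host, int(port)


# "www.apache.org:80" and "www.apache.org" resolve to the same pair.
same = split_host_header("www.apache.org:80") == split_host_header("www.apache.org")
```

Comparing the parsed pairs rather than the raw strings is why a redirect
to strip the ":80" would gain nothing.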

> vrml.wired.com.
> www.apache.org.
> www.hyperreal.com.
> 
>   (a reminder that it looks like we'll have to handle ending-periods too)

Rgph. We don't do that. I suppose we could. I'll think about it.

> hyperreal.com:
> 
>    (is this legal?)

I don't think so. But we handle it correctly.
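
For illustration only, the cases so far (capitalization, a trailing
period, an empty port) could be folded into one normalization step. This
is a sketch under assumptions, not what the server does: normalize_host
is a made-up name, stripping the trailing period is the behaviour being
considered above rather than current behaviour, and treating
"hyperreal.com:" as the default port is one guess at what "correctly"
might mean.

```python
def normalize_host(value, default_port=80):
    """Return (hostname, port) in canonical form, or None if unusable.

    Hypothetical helper; handles plain host names only (no IPv6 literals).
    """
    host, _, port = value.strip().partition(":")
    # Host names compare case-insensitively; also drop any ending period.
    host = host.lower().rstrip(".")
    if not host:
        return None
    if port == "":
        # Covers both "hyperreal.com" and "hyperreal.com:" (assumption).
        return host, default_port
    if not (port.isascii() and port.isdigit()):
        return None
    return host, int(port)
```

Returning None for anything unparseable lets the caller treat the header
as if it had not been sent.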

> hyperreal.com:70
>  
>    (this is coming from a lot of different UA's and hosts, all the UA's 
>    have "via proxy gateway  CERN-HTTPD/3.0 libwww/2.17" attached, I 
>    suppose the CERN proxy has a problem with "http://hyperreal.com:70", 
>    huh?)

But, we already knew that, yes... ?

> linux
> linux.cis.nctu.edu.tw
> www.ukweb.com
> 
>   These are the most bizarre.  The actual logfile entries claim they are 
>   coming from "linux.cis.nctu.edu.tw" and "www.ukweb.com" respectively, 
>   and they appear to be robotic in nature, yet they have the User-Agent 
>   set to valid Mozilla user-agents (like Mozilla/3.0b2 (X11; I; Linux 
>   1.3.94 i586), sometimes i486)) sometimes going via a proxy server, 
>   sometimes not.  I'm almost wondering if this is a bug of some sort - 
>   I've sent mail to Mark and to the .tw mirror maintainer (this is a 
>   mirror I had not been informed of, and isn't on our pages yet) to see 
>   if it's really from them, or if some sort of corruption from the 
>   Referer: field is coming in somehow.

I sure hope not... Could be someone was poring over a spec, came
across Host and misinterpreted it to mean the browser's hostname... I
hope not.

> www-cache.funet.fi
> 
>   Apparently the cache at www-cache.funet.fi (which doesn't appear to 
>   identify itself in the user-agent header, maybe it does in the 
>   Forwarded: header, I don't know) decided to add a "Host: 
>   www-cache.funet.fi".  The browser which sent this was
>   NCSA_Mosaic/2.7b3 (X11;IRIX 5.3 IP19)
>   Maybe it was the browser, but I saw lots of other NCSA_Mosaic/2.7b3 
>   X11's which appeared to handle proxies without a problem. No, wait - 
>   this was also the cause of another bogus Host: header, "fgw".  I 
>   haven't seen any requests from NCSA_Mosaic/2.7b4 through a proxy yet, so
>   maybe this is a bug in XMosaic.  You folks at NCSA want to look at this?

Here's my bet: NCSA Mosaic 2.7b3, when talking to a proxy, sends a
Host header with the proxy's name. This could explain
www-cache.funet.fi, the .tw and ukweb ones, and even fgw - if it's an
internal name of a proxy.

> www.sandbox.net
> 
>   Okay, so both "Mozilla/2.01Gold (Win95; I)" and "Mozilla/2.0 
>   (Macintosh; I; 68K)" sent this erroneously - the URL in these cases was 
>   (get ready)
>   
>   http://www.sandbox.net/cyberhunt2/prot-bin/webfilter/www.lycos.com:80/cgi-bin/pursuit?query=faberge+eggs

Hmm.

>   http://www.sandbox.net/cyberhunt2/prot-bin/webfilter/www.lycos.com/cgi-bin/nph-randurl/cgi-bin/largehostpursuit1.html?query=relic&maxhits=20

Hmm hmm.

>   This is a protected service so I can't see what type of response these 
>   people really got - both requests came from "www.tracer.com".  Maybe 
>   Netscape 2.0 doesn't change the value of the Host: header after a 302 
>   or 303 redirect?

Could be. Or it could be a Netscape clone...

> www.webville.com
> 
>   6 bogus requests were made, all from the same remote host and with the same
>   client (Mozilla/2.0 (Win16; I)), with the referer being
>   "http://www.webville.com/oak/Marco-25/archive.html".  Looking at that 
>   page, there are references to hyperreal in addition to lots of other 
>   places, but I don't see anything that should explicitly trigger such a 
>   bogus request.  

Don't have a clue about that one.

> From all of these, I get the feeling that handling bogus Host: headers is 
> going to be an interesting situation.  Since the migration path will not 
> be smooth, one option I'd like to have is to be able to, on the absence 
> of a Host: header or the existence of a bogus one, return an error, 
> something like "Malformed Request".  Roy will no doubt have opinions on 
> this.  :)

This is not necessary. Malformed headers, if they don't pass muster, are
treated like they didn't exist... just make all your servers
VirtualHosts, and make the "main" server just a page that says "hey,
you, get a browser that supports Host: correctly."
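
The fallback idea can be sketched roughly as follows. This is an
illustration under assumptions, not the server's implementation:
pick_site, the dictionary of sites, and the fallback text are all
hypothetical names invented for the example.

```python
def pick_site(headers, virtual_hosts, main_site):
    """Choose which site answers a request.

    headers: dict mapping lowercased header names to values.
    virtual_hosts: dict mapping lowercased host names to sites.
    main_site: served when Host is missing or matches no virtual host.
    All names are hypothetical, for illustration only.
    """
    raw = headers.get("host", "")
    # Ignore any port and compare the name case-insensitively.
    name = raw.partition(":")[0].strip().lower()
    return virtual_hosts.get(name, main_site)


sites = {"www.apache.org": "apache pages", "hyperreal.com": "hyperreal pages"}
fallback = "get a browser that supports Host: correctly"
```

A missing header and a bogus one such as "fgw" take the same path here:
neither matches a virtual host, so both land on the main server's page.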

If you want them separately, that's something different.

-- 
________________________________________________________________________
Alexei Kosut <akosut@nueva.pvt.k12.ca.us>    
URL: http://www.nueva.pvt.k12.ca.us/~akosut/  
Lefler on IRC, DALnet <http://www.dal.net/>   

