tomcat-users mailing list archives

From André Warnier ...@ice-sa.com>
Subject Re: [OT?] Virtual hosting - does the port need to match the URL port?
Date Tue, 17 May 2011 22:35:22 GMT
sebb wrote:
> HTTP requests include a "Host:" header which generally specifies the
> target hostname and port (omitted if it is the default port).

> AIUI, in virtual hosting situations, the name in the Host header may
> be different from the URL host.
> So for example a request to:
> 
> http://localhost:8080/
> 
> might be sent with the header:
> 
> Host: stimpy:8080
> 
> in order to direct the request to the "stimpy" virtual host.
> 
> Does it ever make sense for the "Host" port to be different from the
> URL port? For example:
> 
> Host: stimpy:9090
> 
> As far as I can tell, Tomcat validates the format of the Host header,
> but otherwise ignores the port?
> Is that correct?
> 
> Does anyone know of other servers that make use of the Host port setting?
> 

With respect (and a willingness to help), I believe that you have a wrong understanding of
the basics of how this all works, and as a consequence your questions and theories above
are a bit off the mark (if not necessarily off-topic for this list).


When you enter a URL in the URL box of the browser and hit <return>, what happens is as
follows.  Let's suppose that the URL in question is :

http://someserver.company.com:8080/something?name1=value1


1) the browser splits the URL into several parts :
- http://  (the "protocol" part)
- someserver.company.com  (the hostname part)
- (optionally) 8080 : (the port, which if not specified is "80" for HTTP)
- the rest : /something?name1=value1
  (which is itself composed of different parts, but that is not important here)

2) the browser takes the hostname part "someserver.company.com", and asks its local 
operating system to translate this to an IP address.  The part of the OS which does this 
is called the "resolver", and it makes use of the DNS system in order to make this
name-to-address translation.

3) when the browser has the IP address of the target server, it creates a TCP connection 
*to that IP address*, and to the explicit or implicit port.
The result of this is that there is now a direct connection between the browser and that 
server, over TCP/IP.
And on that server, this connection is handled by whichever process was listening on that
port.  In general, for HTTP, this will be a webserver process (like Apache httpd or Tomcat).

So what the browser writes to that connection, is read by the webserver, and vice-versa 
what the webserver writes to that connection, is received and read by the browser.

4) on this connection, the browser now writes an HTTP request.  That request is composed of
several lines, of which there are at least the following 3 lines :
GET /something?name1=value1 HTTP/1.1
Host: someserver.company.com:8080

(the empty line indicates the end of the request headers, which for a "GET" request is 
also the end of the request)
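
As a rough illustration of client-side steps 1 and 4 above, here is a minimal Python sketch.  The function names split_url and build_request and the DEFAULT_PORTS table are made up for this example; only urllib.parse.urlsplit comes from the Python standard library.  The sketch appends the port to the Host: value when it is not the default for the scheme, which is what RFC 2616 describes for the Host: header.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: only these two schemes are handled.
DEFAULT_PORTS = {"http": 80, "https": 443}


def split_url(url):
    """Split a URL into scheme, hostname, port and 'the rest' (step 1)."""
    parts = urlsplit(url)
    # An absent port means the default port for the scheme (80 for HTTP).
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return parts.scheme, parts.hostname, port, target


def build_request(url):
    """Build the minimal HTTP/1.1 GET request text (step 4)."""
    scheme, hostname, port, target = split_url(url)
    host_header = hostname
    # The port is left out of Host: when it is the default port.
    if port != DEFAULT_PORTS[scheme]:
        host_header += ":" + str(port)
    # The final empty line marks the end of the request headers.
    return "GET {} HTTP/1.1\r\nHost: {}\r\n\r\n".format(target, host_header)


if __name__ == "__main__":
    url = "http://someserver.company.com:8080/something?name1=value1"
    print(split_url(url))
    print(build_request(url))
```

Steps 2 and 3 (name resolution and the TCP connection) are left out here on purpose, since they need a reachable server.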

Now we look at the webserver, which receives this request over the connection.

1) The server reads the request lines, and first looks at the "Host:" line.
That tells it which "virtual server" (or "virtual host") the browser is trying to reach. 
In this case the browser, through that Host: header line, indicated that it is a virtual 
host named "someserver.company.com".

2) The webserver then checks in its configuration, if it really has a virtual host named 
that way.  For Tomcat, that would mean that in its "server.xml" file, there exists a tag 
like :
<Host name="someserver.company.com" ...>.

2a) if there is no such virtual host, then the server will use its "default host" to
process this request.  For Tomcat, that is the Host (also identified by a <Host ..> tag)
whose name is given in the "defaultHost" attribute of the <Engine> tag, like :
<Engine name="Catalina" defaultHost="localhost">

2b) if there is a <Host ..> tag where the "name" attribute matches the request "Host:"
header, then the server will pick that Host to answer this request.

3) Now the server looks at the first line of the request again, to see what the browser 
wants inside that selected Host (here it is thus "/something?name1=value1").  In this 
case, it would be a "web application" (or "webapp", or "context"), with the name "/something".
The webserver now passes the whole browser request (first line, other header lines, and
possibly also a request body), to the application "/something" within this Host.
(and in the case of Tomcat, if there is no such application, it will pass the request to 
the "default application" (also named "ROOT")).

4) that application creates a response, and writes it back into the TCP connection.

and now finally the browser reads that response from the connection, and displays it to 
the user in the browser window.
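
The server-side selection in steps 1 to 3 above can be sketched roughly as follows.  This is not Tomcat's code: the HOSTS table, DEFAULT_HOST and all function names are invented for the illustration, standing in for the <Host> tags and the defaultHost attribute of server.xml.  It also simplifies matters by matching only the first path segment and by not handling IPv6 literals in the Host: header.

```python
# Hypothetical configuration: virtual host name -> deployed application paths.
HOSTS = {
    "localhost": ["/something", "/other"],
    "someserver.company.com": ["/something"],
}
DEFAULT_HOST = "localhost"  # plays the role of <Engine defaultHost="...">


def parse_request_head(text):
    """Return (request_target, headers) from the request line and header lines."""
    lines = text.split("\r\n")
    method, target, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        if line == "":
            break  # the empty line ends the headers
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return target, headers


def select_host(host_header):
    """Pick the virtual host by name only; any ':port' suffix is ignored."""
    name = host_header.partition(":")[0].lower()
    return name if name in HOSTS else DEFAULT_HOST


def select_context(host, target):
    """Pick the application from the first path segment, else fall back to ROOT."""
    path = target.split("?", 1)[0]
    first_segment = "/" + path.lstrip("/").split("/", 1)[0]
    return first_segment if first_segment in HOSTS[host] else "ROOT"


def route(request_text):
    """Combine the steps: Host: header first, then the request line."""
    target, headers = parse_request_head(request_text)
    host = select_host(headers.get("host", ""))
    return host, select_context(host, target)


if __name__ == "__main__":
    request = "GET /something?name1=value1 HTTP/1.1\r\nHost: someserver.company.com\r\n\r\n"
    print(route(request))
```

A request naming an unknown host such as "stimpy" ends up at the default host in this sketch, whatever port follows the name.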


Now the above is really a drastic summary, and in reality there is a lot more that 
happens between each of the steps above.  I have also taken some liberties with the 
language.  But the above is fundamentally true, for all webservers, not only for Tomcat.
And that is because all browsers and all webservers, with respect to what is exchanged 
over the connection, follow the HTTP protocol, and that is a general Internet standard 
defined independently of any browser and any webserver, and valid for all of them (which 
is the point of a standard of course).

So, to get back to your questions above :


 > HTTP requests include a "Host:" header which generally specifies the
 > target hostname and port (omitted if it is the default port).

According to HTTP RFC 2616, the Host: header MUST be present in every HTTP/1.1 request,
and MUST specify the target hostname.

 > AIUI, in virtual hosting situations, the name in the Host header may
 > be different from the URL host.
 > So for example a request to:
 >
 > http://localhost:8080/
 >
 > might be sent with the header:
 >
 > Host: stimpy:8080
 >
 > in order to direct the request to the "stimpy" virtual host.

If you have understood the above explanation, you will now know that this was wrong.

In order to even connect to the machine that runs the webserver of interest, the browser 
will first need to use a hostname (in the URL) that can be resolved by the OS to a valid 
IP address.  And when it has that, it will make a connection to that IP address (and to 
the port indicated in the URL, or by default to the port 80).
And then, in the request itself that it sends onto that connection, it will /repeat/ that
same hostname in the Host: header.

(RFC 2616 says that a port can be present in the Host: header; but it does not mention
what the server should do with it.  And I can't think of what it could do with it either,
since by the time the server reads this header, the connection is already established with
the webserver anyway.)
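
To make the ordering concrete, here is a small self-contained Python sketch using only the standard socket and threading modules.  It is a toy, not a real webserver, and the client is hand-written rather than a browser: it deliberately sends "Host: stimpy:9090" over a connection to some other local port.  The names serve_once and demo are invented for this example.  The server side learns the port of the connection from the socket itself, before it has read a single header line.

```python
import socket
import threading


def serve_once(listener, result):
    """Accept one connection, record the socket port and the Host: value."""
    conn, _ = listener.accept()
    with conn:
        # Known from the TCP connection alone, before any header is read.
        result["connected_port"] = conn.getsockname()[1]
        data = b""
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(1024)
            if not chunk:
                break
            data += chunk
        for line in data.decode("ascii").split("\r\n")[1:]:
            if line.lower().startswith("host:"):
                result["host_header"] = line.split(":", 1)[1].strip()
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n")


def demo():
    """Run the toy server and a hand-written client on the loopback interface."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    result = {}
    worker = threading.Thread(target=serve_once, args=(listener, result))
    worker.start()
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        # The port in Host: has no influence on where this connection goes.
        client.sendall(b"GET / HTTP/1.1\r\nHost: stimpy:9090\r\n\r\n")
        client.recv(1024)
    worker.join()
    listener.close()
    return port, result


if __name__ == "__main__":
    print(demo())
```

In this sketch the recorded connection port is always the listening port chosen by the OS, while the Host: value is just text that the server is free to interpret or ignore.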

 > Does it ever make sense for the "Host" port to be different from the
 > URL port? For example:
 >
 > Host: stimpy:9090
 >

As far as I know, no.  And there is also no standard browser which would do such a thing.

 > As far as I can tell, Tomcat validates the format of the Host header,
 > but otherwise ignores the port?
 > Is that correct?
 >
Kind of. It will probably ignore the port, because it is irrelevant.

 > Does anyone know of other servers that make use of the Host port setting?
 >
As far as I know, none.






