httpd-dev mailing list archives

From "Roy T. Fielding" <field...@avron.ICS.UCI.EDU>
Subject Re: multiple requests per connection (Roy ?)
Date Mon, 01 May 1995 22:30:07 GMT
Argh, it's getting so hard to take a vacation....

Here are my notes on various ways to do keep-alive.  They are rough,
and should not be treated as sacred in any way or form.

Keep-Alive Notes
================

> There have been several proposals for extending HTTP to allow multiple
> requests per TCP connection.  I usually refer to this mechanism as
> keepalive, but a lot of names apply.
> 
> I'm interested in your opinion, since you've been following this discussion
> more closely.  I want to implement a prototype of one of these proposals to
> see just how well it works.
> 
> If you were going to prototype one of these ideas, which would you choose?

All of them.  Seriously, they are all relatively easy to implement, and
I am not sure which one would be best in practice.

There seem to be three ways to accomplish multiple requests:

  A) create a new method that allows a keep-alive session to be set up;
  B) send a request header that indicates a willingness to keep-alive,
     and a response header that indicates the server is staying alive;
  C) define multiple M* methods (or a single MULTI method) that send
     a list of requests in-bulk and receive a list of responses in-bulk.

The advantage of A is that it is cleaner and more appropriate for those
situations where a true session is desired (as in the case of a permanent
connection to a local proxy, or regional proxies talking to a national
gateway) and it provides a convenient way to establish the defaults
(such as security) for the entire session.  It would also fit well with
the proposals for HTTPng.  The disadvantage is that you get a Bad Method
response from current servers and new servers would have to deal with
things like short-lived session time-outs.

The advantage of B is that it won't generate a Bad Method response.  The
disadvantage is that you can't set up session-level defaults until after
the first request (if at all) and you must guess the capabilities of the
server prior to making that request (the status quo).

The advantage of C is that it retains the simple RPC-like behavior of
current HTTP. This means that the server does not need to worry about
closing the connection just as the client sends a new request.  It also
allows people to define MGET, MHEAD, MPOST, MPUT, ... separately, which
some people think is an advantage (I think it's incredibly dumb).  There
are many disadvantages:  it always requires two requests (one to get the
base document, another to get the in-line src's at that site); it requires
the client to know what it wants before the request (i.e., no support for
true content negotiation and URN->URC->URL redirection); it requires
intermediate proxy servers to parse the request into separate ones and
service them in sequence; and, it requires considerably more code to
achieve the same functionality as a session.

There are several orthogonal issues:

  1) how do you know the receiver will accept multiple requests;
  2) how do the multiple requests get initiated;
  3) how do the multiple responses get packaged;
  4) how does the server know when it's done.

For (1): A: cache of known servers or get an error message if it doesn't
         B: the server can tell you in the first response
         C: the server can tell you in the prior connection's response

For (2): A: after the initial request, all following requests are normal
         B: all requests are normal
         C: on the second connection, all requests are packaged

For (3): A: response sent after each request, terminated by content-length
            and/or packetized Content-Transfer-Encoding
         B: response sent after each request, terminated by content-length
            and/or packetized Content-Transfer-Encoding
         C: response sent in one concatenated sequence, each response
            delimited by content-length and/or packetized C-T-E

For (4): In all cases, server may close the connection at any time.
         A: client closes the connection or server times out
         B: last request does not include keep-alive header,
            client closes the connection, or server times out
         C: server closes after completing response to M* request.

I think option (B) is the most likely candidate to make it into the 1.1
specification, with (A) being added for 2.0.  Henrik is planning a test
implementation of (B) right now.

One key question is how the implementations compare on non-Unix platforms.
Ideally, I would like to be able to *demonstrate* which one is best rather
than spend six months arguing about philosophical differences.

As for implementation specifics:

=========================================================================
Packetized Content-Transfer-Encoding
------------------------------------

HTTP/1.1 will include two new CTE values, "packet" and "packet64",
representing packetized binary and base64 content, respectively.
The packetization structure follows that proposed by Dan Connolly
[excerpted from <9409271503.AA27488@austin2.hal.com> on www-talk]:

   The bytes of a body are written into a body part in packets: each
   packet consists of a length (written out in ASCII characters, to avoid
   byte order and word size issues), a CRLF, and a corresponding number
   of bytes. For example, server code might look like:

   #include <stdio.h>

   void write_response(FILE *body, const char *contentType)
   {
        enum { LF=10, CR=13 };
        char buffer[4000]; /* chosen to match system considerations */
        int bytes;

        printf("HTTP/1.1 200 OK%c%c", CR, LF);
        printf("Content-Type: %s%c%c", contentType, CR, LF);
        printf("Content-Transfer-Encoding: packet%c%c", CR, LF);
        printf("%c%c", CR, LF);

        while ((bytes = fread(buffer, 1, sizeof(buffer), body)) > 0) {
                printf("%d%c%c", bytes, CR, LF);
                fwrite(buffer, 1, bytes, stdout);
        }

        /* @@ Hmmm... what happens if I get an error reading from body?
         * perhaps negative packet lengths could be used to indicate
         * errors?
         */
        printf("0%c%c", CR, LF);
   }


   The returned data might look like:

        HTTP/1.1 200 OK
        Content-Type: application/octet-stream
        Content-Transfer-Encoding: packet

        4000
        ...4000 bytes of stuff...
        1745
        ...1745 bytes of stuff...
        0

   Then the connection would be available for another transaction.

CTE: packet64 would be the same, except that the "bytes of stuff"
would be the base64 encoding of the original bytes.  
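
For the receiving side, a minimal decoding sketch, assuming the body
arrives on a stdio stream after the headers have been consumed (error
handling is deliberately thin; the negative-length case is left as the
open question above):

```c
#include <stdio.h>
#include <stdlib.h>

/* Read a CTE "packet" body: an ASCII length line (CRLF-terminated),
 * then that many raw bytes, repeated until a zero-length packet.
 * Copies the body bytes to `out`; returns the total byte count, or
 * -1 on a malformed or truncated stream.
 */
long read_packetized_body(FILE *in, FILE *out)
{
    char lenline[32];
    long total = 0;

    for (;;) {
        long len;
        if (fgets(lenline, sizeof(lenline), in) == NULL)
            return -1;                 /* stream ended mid-body */
        len = strtol(lenline, NULL, 10);
        if (len == 0)
            return total;              /* "0" CRLF terminates the body */
        if (len < 0)
            return -1;                 /* negative lengths: reserved */
        while (len > 0) {
            char buf[4000];
            size_t want = len < (long)sizeof(buf) ? (size_t)len : sizeof(buf);
            size_t got = fread(buf, 1, want, in);
            if (got == 0)
                return -1;
            fwrite(buf, 1, got, out);
            total += (long)got;
            len -= (long)got;
        }
    }
}
```

Note that after a complete body the stream is positioned exactly at the
next message, which is what makes the connection reusable.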

=========================================================================
Option A: session-like method
-----------------------------

I suggest calling the method "OPTIONS", with the idea being that the
client would send the request (along with its desired connection options)
and the server's response (headers only) would include the server's
options, including known methods, security options, and session
characteristics.  One issue is whether we should stick with independent
header names or go with a single extension semantics as Dave Kristol
has suggested (though I would implement a less verbose version).

After that, client sends requests as normal and server responds as normal,
except that all responses must have accurate content-length or packet CTE.
The session ends when either side closes the connection.  I suggest a short
5-10 second timeout for client requests, with some experimentation necessary.
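
The session setup could be sketched in the same printf style as the
packet example above.  The option header here ("Session-Timeout") is
invented purely for illustration; nothing above settles on actual
header names:

```c
#include <stdio.h>

/* Hypothetical sketch of an Option-A opener: the client sends OPTIONS
 * with its desired session options, then reads the server's headers
 * (known methods, security options, session characteristics) before
 * issuing normal requests on the same connection.
 */
void send_options_request(FILE *conn)
{
    enum { LF = 10, CR = 13 };

    fprintf(conn, "OPTIONS * HTTP/1.1%c%c", CR, LF);
    fprintf(conn, "Session-Timeout: 10%c%c", CR, LF);  /* hypothetical name */
    fprintf(conn, "%c%c", CR, LF);
    fflush(conn);
}
```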

=========================================================================
Option B: header-based keep-alive
---------------------------------

Ignore the info on the Connection header in the old 1.0 draft 01.

Instead, define a Connection header as a list of header field names,
and include

   Connection: Keep-Alive
   Keep-Alive: timeout=10, maxreq=5, token=word, ...

on both requests and responses (the response indicating its acceptance
by the server).  After that, client sends requests as normal and server
responds as normal, except that all responses must have accurate
content-length or packet CTE.  The session ends when either side closes
the connection.  The server may close the connection immediately after
responding to a request without a Keep-Alive.  A client can tell if the
connection will close by looking for a Keep-Alive in the response.

> I assume what this means is that the Connection header is sent by the
> client and that the Keep-Alive header is returned by the server.  We don't
> send both across both ways, do we?

I'm not sure -- depends on how you want to implement it, since
nobody else has yet.  In theory, the Connection header would contain a
list of header field names. If the name is also present as a header,
that header would be marked as connection-only. If the name is not a
header, then it is treated as a token without any attributes.

So, the client can send 

   Connection: keep-alive

to indicate that it desires a multiple-request session, and 
the server responds with

   Connection: keep-alive
   Keep-Alive: max=20, timeout=10

to indicate that the session is being kept alive for a maximum of
20 requests and a per-request timeout of 10 seconds.

At least, that's the idea.  You will probably think of additional needs
when it is implemented.
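
As one sketch of what a client might do with such a response, here is a
minimal parser for the Keep-Alive parameters.  The parameter names
follow the example above and are illustrative only; unknown tokens are
skipped so that new parameters can be added later:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Pull "max" and "timeout" out of a header value such as
 * "max=20, timeout=10".  Values not present are left untouched.
 */
void parse_keep_alive(const char *value, int *max_req, int *timeout)
{
    char copy[256];
    char *tok;

    strncpy(copy, value, sizeof(copy) - 1);
    copy[sizeof(copy) - 1] = '\0';

    for (tok = strtok(copy, ", \t"); tok != NULL;
         tok = strtok(NULL, ", \t")) {
        if (strncmp(tok, "max=", 4) == 0)
            *max_req = atoi(tok + 4);
        else if (strncmp(tok, "timeout=", 8) == 0)
            *timeout = atoi(tok + 8);
        /* unrecognized tokens are ignored */
    }
}
```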

> I note that the stuff on Connection headers has been removed in the latest
> draft.  Is it the case that Connection headers will be defined to not go
> through proxies?

Yes, but I couldn't say that for HTTP/1.0, since it would invalidate
current practice.  Both the Connection header and any other headers named
by it will be required not-to-be-forwarded by HTTP/1.1 proxies, and
HTTP/1.1 applications will not trust "Connection" info that is received
from something calling itself HTTP/1.0.

=========================================================================
Option C: M* method(s)
----------------------
Here is what was proposed by John Franks.  I have several problems with
the proposed syntax (MGET needs a dummy URI, -1 as an EOF, erroneous use
of Accept where Public would be better), but that's nothing compared with
the philosophical problems.  I think a MULTI method will be defined for
HTTP/1.1, which will be a package of requests.


   Proposal for an HTTP MGET Method
 
   A much discussed pressing need for HTTP is a mechanism for the client
   to request and the server to deliver multiple files in a single
   transaction. The primary (but not sole) motivation is the need to
   improve efficiency of downloading an HTML file and multiple inline
   images for that file. Here is a concrete proposal for the addition of
   an MGET method to HTTP to meet this need.
   
   _Design Objectives_

    1. The number of transactions to download an HTML file with inline
       images should be minimized.
    2. The client should have the option of downloading all, some or none
       of the inline images. It should be able to request some or all
       files with an "If-Modified-Since:" header.
    3. Additions to the protocol should be simple to implement for
       servers and browsers.
    4. The statelessness of the server should be preserved.
    5. New client / old server and old client / new server transactions
       should work as should new client / old proxy etc.
    6. The server must be able to transmit the multiple files "on the
       fly" without knowing their size in advance and without making a
       copy of the whole package before transmission.
    7. The server must have the option of returning some of the requested
       files while denying access or reporting an error for others. In
       particular, it must return a separate status header for each
       requested file.
   
   
   In order to achieve the second objective above a minimum of two
   transactions will be required. In the first transaction the client
   receives the base HTML document (this is a GET transaction and should
   be identical to HTTP/1.0). The client is then in a position to decide
   which inline images to request. It may want all, some, or none, as it
   may have some cached or it may be incapable of displaying images.
   
   The second transaction is an MGET transaction. The client lists the
   URIs it wants, each followed by an "Accept:" header applicable to that
   URI alone. The client can also provide an "If-Modified-Since:" header
   for any of the requested files, which should work as it does in
   HTTP/1.0. The server returns the files with a packet
   content-transfer-encoding, beginning each packet with the exact number
   (in ASCII) of bytes in that packet and terminating each file with a
   packet of size -1. Here is an example of a client-server exchange.
   


        C: GET /foo.html HTTP/2.0<CRLF>
           Accept: text/html, text/plain<CRLF>
        S: HTTP/2.0 200 Success<CRLF>
           Content-Type: text/html<CRLF>
           Accept: GET, MGET, POST, HEAD, MHEAD         <-- see note (*) below
           ...<other headers>
           <CRLF>
           <sends file>
           <closes connection>


  [This was a GET request, identical to HTTP/1.0 except for the additional
  header line "Accept: MGET..." from the server.  The second request uses
  MGET.]

        C: MGET HTTP/2.0<CRLF>
           URI: /images/bar1.gif<CRLF>
           If-Modified-Since: Saturday, 29-Oct-94 20:04:01 GMT
           Accept: image/gif, image/x-xbm<CRLF>
           URI: /images/bar2.gif<CRLF>
           If-Modified-Since: Saturday, 29-Oct-94 20:04:01 GMT
           Accept: image/gif, image/x-xbm<CRLF>
           URI: /images/bar3.gif<CRLF>
           Accept: image/gif, image/x-xbm<CRLF>
           URI: /images/bar4.gif<CRLF>
           Accept: image/gif, image/x-xbm<CRLF>
           <CRLF>

        S: HTTP/2.0 200 Success<CRLF>
           URI: /images/bar1.gif<CRLF>
           Content-Type: image/gif<CRLF>
           Content-Transfer-Encoding: packet<CRLF>
           <CRLF>
           8000<CRLF>
           ... 8000 bytes of image data first packet...
           2235<CRLF>
           ... 2235 bytes of image data completing file...
           -1<CRLF>
           <CRLF>
           HTTP/2.0 304 Not Modified<CRLF>
           URI: /images/bar2.gif<CRLF>
           Expires: Saturday, 29-Oct-95 20:04:01 GMT
           <CRLF>
           HTTP/2.0 403 Forbidden
           URI: /images/bar3.gif<CRLF>
           <CRLF>
           HTTP/2.0 200 Success<CRLF>
           URI: /images/bar4.gif<CRLF>
           Content-Type: image/gif<CRLF>
           Content-Transfer-Encoding: packet<CRLF>
           <CRLF>
           150213<CRLF>
           ... 150213 bytes of image data (complete file)...
           -1<CRLF>
           <CRLF>
         S: <closes connection>


   
   
   This seems to me to meet all the objectives listed above. Comments are
   welcome.

   Note (*):  This line was added as a result of recent discussions on
   the http-wg mailing list.  The presence of MGET notifies the client
   that the server will accept the MGET method.  This should be ignored
   if the client is communicating with a proxy, because the proxy might
   not be MGET capable.  Perhaps a new proxy should add an additional
   header like "Proxy-Allow: MGET etc..." if it wishes to allow the
   client to use MGET.



   John Franks     Dept of Math. Northwestern University
                   john@math.nwu.edu

=========================================================================

That's about it.  Have fun,

.......Roy
