Date: Tue, 19 Dec 95 14:20 GMT
From: drtr@ast.cam.ac.uk (David Robinson)
To: new-httpd@hyperreal.com
Subject: Re: Generalising Connections
Reply-To: new-httpd@apache.org

Ben wrote:
>As I have mentioned before, the problem with Apache and modules which want to
>take over the data transport, is that Apache knows that a connection is a file
>descriptor. This is not generally true - especially under non-Unix OSes, where
>typically even a plain ordinary TCP/IP connection is not a file descriptor.
>
>Making connections totally generalised is a non-trivial task, but there is at
>least one thing that is clear: the "client" and "request_in" members of
>conn_rec must go (see httpd.h). This implies that all functions that use them
>need changing, and all low-level functionality (e.g. read, write, open, close)
>must be supplied in a modular way.
>
>This is, of course, similar to the way that modules work, but not quite the
>same; the function table should be associated with each connection (this
>allows dynamic matching of transports to connection, and layering), rather
>than being part of a static list.
>
>I propose a scheme along these lines; we have a transport function table:
>
>typedef struct connection connection;
>typedef struct transport_fn_table transport_fn_table;
>
>struct transport_fn_table
>    {
>    int (*write)(connection *conn,const char *buf,int n);
>    int (*read)(connection *conn,char *buf,int n);
>/* etc... */
>    };
>
>client and request_in in conn_rec are replaced by:
>
>    connection *conn;
>
>and a connection looks like:
>
>struct connection
>    {
>    void *info;  // private data for the particular type of connection
>    transport_fn_table *fn;
>    };
>
>Simple, huh? Add a few macros, and the whole thing is (nearly) transparent to
>the ordinary module, for example:
>
>#define conn_write(conn,buf,n)  (conn)->fn->write(conn,buf,n)
>
>Of course, C++ fans will note how much neater this would be in C++.
>
>The reason I intended to write about this in conjunction with CVS is simple;
>with the patch and vote system it could take a long and painful time to get
>this change in. It'll be a good test of the efficacy of CVS trying to get such
>a global change done.

I've been giving this a lot of thought recently. The main problem with your
scheme is that it is insufficiently modularised. Instead, I would suggest
something based loosely on the SVR4 STREAMS interface; i.e. allow multiple
modules to intercept the data flowing to/from the client.

A stream is a sequence of, err, 'boxes' (the standard name of 'modules' would
be confusing for Apache). The active handler talks to the box at the head of
the stream. To output data to the client, the handler sends a message to the
stream head by calling its put() routine. This box then passes the message
downstream by calling the next box's put() routine. This repeats until a box
actually sends the data.

    /----------\
    | mod_asis |   Handler routine
    \----------/
       |  /|\
      \|/  |
    +----------+
    |  box_tr  |   Stream head
    +----------+
       |  /|\
      \|/  |
    +----------+
    | box_http |   Driver
    +----------+

Boxes can be added to the stream head by 'pushing' them onto the stream.
mod_include.c could be usefully re-written as a box (assuming the problem of
saved state could be solved).

Example: a CGI request which returns a server-parsed document.

1. A connection is created.
2. A stream for the connection is created, containing a basic 'I/O' box.
3. Other boxes are pushed onto the stream, e.g. one for chunked encoding on
   a persistent connection.
4. A request is received; the stream is duplicated and initialised for this
   request.
5. The document type of the script is found to be text/x-server-parsed-html
   so box_include is pushed onto the stream.
6. The CGI module runs the script and sends its output down the stream.
7. The per-request stream is closed down.

It isn't really that complicated. The only tricky part is what to do about
the HTTP message headers; are they sent down the stream as well? (Memo: in
the chunked content encoding, extra HTTP headers can be sent to the client
_after_ the object body has been sent.)

 David.
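
For illustration only, the chain of boxes described above might be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not code from Apache or from SVR4 STREAMS: every name here (box, box_push, stream_put, upper_put, sink_put, sink) is hypothetical. The driver simply appends to a fixed buffer where a real one would write to the client, and an upper-casing filter stands in for a box such as box_tr.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical illustrations. */
typedef struct box box;

struct box {
    /* Pass n bytes downstream; returns n on success, -1 on error. */
    int (*put)(box *self, const char *buf, int n);
    box *next;   /* next box downstream; NULL for the driver */
    void *info;  /* private data for this particular box */
};

/* Push a box onto the head of a stream and return the new head. */
static box *box_push(box *head, box *b)
{
    assert(b != NULL);
    b->next = head;
    return b;
}

/* The handler talks only to the stream head. */
static int stream_put(box *head, const char *buf, int n)
{
    return head->put(head, buf, n);
}

/* Driver stand-in: collects output in memory instead of sending it. */
typedef struct {
    char data[256];
    int len;
} sink;

static int sink_put(box *self, const char *buf, int n)
{
    sink *s = self->info;
    if (n < 0 || s->len + n >= (int)sizeof s->data)
        return -1;                      /* would overflow the buffer */
    memcpy(s->data + s->len, buf, (size_t)n);
    s->len += n;
    s->data[s->len] = '\0';
    return n;
}

/* Filter: upper-cases the data, then calls the next box's put(). */
static int upper_put(box *self, const char *buf, int n)
{
    char tmp[64];
    int done = 0;

    if (n < 0 || self->next == NULL)
        return -1;
    while (done < n) {
        int chunk = n - done;
        int i;
        if (chunk > (int)sizeof tmp)
            chunk = (int)sizeof tmp;
        for (i = 0; i < chunk; i++)
            tmp[i] = (char)toupper((unsigned char)buf[done + i]);
        if (self->next->put(self->next, tmp, chunk) != chunk)
            return -1;
        done += chunk;
    }
    return n;
}
```

To use it, a caller would set up a sink, wrap it in a driver box whose put is sink_put, push that onto an empty (NULL) stream, push a filter box whose put is upper_put on top, and then call stream_put() on the resulting head. Each box only knows about its immediate downstream neighbour, which is what lets boxes be pushed per connection or per request.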