Date: Tue, 10 Jun 1997 01:05:43 +0100 (BST)
From: Rob Hartill
To: new-httpd@apache.org
Subject: Re: Thoughts on a 2.0 API

Some responses, but I admit I've mostly hijacked Alexei's post to throw
in some of my own thoughts and ideas.

On Mon, 9 Jun 1997, Alexei Kosut wrote:

> Here are some thoughts I've had on the design of an API for Apache
> 2.0. Please feel free to ignore it.

> In earlier discussions, assuming we keep a similar request model in
> 2.0 (which I think is likely), we've discussed having at least the
> following phases:
>
> 1. "connection open" phase
> 2. begin request phase
> 3. URL->URL translation
> 4. URL->filename translation
> 5. filename->filename translation
> 6. header parse phase

It might be confusing terminology, but I think 'header parsing' belongs
before 3. I think we need to collect as much information about the
request as early as possible so that any translations that take place
can use that information. e.g. if a header specifies a preference for
Greek over English then the URL->->->filename translation phase should
be able to act on that.

> 7. access check
> 8. user id check
> 9. auth check
> 10. type check
> 11. fixup phase
> 12. handler phase
> 13. . pre-header phase
> 14. . post-header (pre-body) phase
> 15. end request phase
> 16. logging phase
> 17. end connection phase
> 18. "connection closed" phase
>
> And I may be missing some we've discussed.

Those should be considered some of the basics. I'd also like to be able
to squeeze any handler I come up with in between any of those listed
above, e.g. I might want to add a handler near phase 2 that just aborts
the connection if the incoming IP is a nuisance site.

> The point is, there are
> (will be) twice as many request phases in 2.0 as in 1.x. We've also
> had problems with the nature of the current model. Currently, most
> phases are run always, with the exception of the handler phase, which
> is keyed off of handler/media type.

Assuming we end up with a form of stacked handlers, I think it'd be
useful to allow any handler to switch off other handlers that follow it
for any particular request. To do that, handlers need a way to identify
themselves uniquely.

> What is the solution? How about making the API dynamic? A module_rec
> might just have an initializer phase (or two - we've discussed
> "run-on-fork" initializers as well as the "run-on-start" we have now),
> and a command table (or something). The initialization function would
> call things like:
>
> add_request_phase(&handler_func, HANDLER_PHASE, "text/html", M_GET|M_POST);
> add_request_phase(&type_func, CHECK_TYPE_PHASE, NULL, M_ANY);
>
> These might also be called from command functions (i.e., putting in
> the first "AddType" command would cause mod_mime to add a check_type
> phase).

mod_perl does something similar. You can push a list of functions
(handlers) into any of the current phases. What I've noticed about the
way I (ab)use that system is that categorising the phases breaks down
into a free-for-all once you decide exactly where you want the function
called... you end up thinking to hell with what the phase is called and
supposed to do, I want my handler *there*. If I want to call a bit of
code very early then I currently stick it in at the headerparser phase
even though it has nothing to do with parsing headers as such.

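To make the "I want my handler *there*" point concrete, here is a minimal
sketch in C of a chain in which every handler carries a unique name and a
new handler can be slotted in ahead of any named one. It is only an
illustration under assumed names: `request`, `handler`,
`chain_insert_before`, `chain_run`, `demo_run` and the nuisance address
are all made up for this example and are not part of any Apache or
mod_perl API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Everything below is hypothetical; none of these names exist in Apache. */
#define MAX_TRACE 16
#define NUISANCE_IP "203.0.113.9"   /* made-up address from a documentation range */

typedef struct request {
    const char *remote_ip;
    const char *trace[MAX_TRACE];   /* names of the handlers that ran, in order */
    int ntrace;
} request;

/* A handler returns 0 to continue down the chain, nonzero to abort. */
typedef int (*handler_fn)(request *r);

typedef struct handler {
    const char *name;               /* unique identifier for this handler */
    handler_fn fn;
    struct handler *next;
} handler;

/* Insert h ahead of the handler called `before`; a NULL name appends.
 * Returns 0 on success, -1 if no handler has that name. */
static int chain_insert_before(handler **head, const char *before, handler *h)
{
    handler **link = head;

    assert(head != NULL && h != NULL);
    while (*link != NULL &&
           (before == NULL || strcmp((*link)->name, before) != 0))
        link = &(*link)->next;
    if (*link == NULL && before != NULL)
        return -1;
    h->next = *link;
    *link = h;
    return 0;
}

/* Run handlers in chain order, stopping at the first nonzero result. */
static int chain_run(handler *head, request *r)
{
    for (handler *h = head; h != NULL; h = h->next) {
        if (r->ntrace < MAX_TRACE)
            r->trace[r->ntrace++] = h->name;
        int rc = h->fn(r);
        if (rc != 0)
            return rc;
    }
    return 0;
}

static int pass(request *r) { (void)r; return 0; }

/* Abort early when the request comes from the nuisance address. */
static int block_nuisance(request *r)
{
    return strcmp(r->remote_ip, NUISANCE_IP) == 0;
}

/* Build a small chain, shove the blocker in front of it, and run it. */
int demo_run(const char *ip, request *r)
{
    handler translate = { "translate", pass, NULL };
    handler access    = { "access", pass, NULL };
    handler deliver   = { "deliver", pass, NULL };
    handler blocker   = { "block-nuisance", block_nuisance, NULL };
    handler *head = NULL;

    r->remote_ip = ip;
    r->ntrace = 0;
    chain_insert_before(&head, NULL, &translate);
    chain_insert_before(&head, NULL, &access);
    chain_insert_before(&head, NULL, &deliver);
    chain_insert_before(&head, "translate", &blocker);
    return chain_run(head, r);
}
```

Calling `demo_run()` with an ordinary address walks all four handlers with
the blocker first; with the nuisance address the chain stops after the
blocker. The position is chosen by naming a neighbour, not a phase.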
So, I think it's better to throw out the concept of named phases. By all
means start from the ordered phases we have and propose, but don't
enforce that order on everyone implicitly or explicitly. I should be
allowed to juggle any of the core handlers around to achieve my
objectives without feeling guilty about it.

> Of course, these would be stored like per-dir configs are now,
> and merged for the request. In fact, add_request_phase might even be
> called *during* the request. For example, a connection-open phase that
> activates SSL might add a connection-close phase that turned off SSL
> (I don't know exactly how SSL works, so this might not be a good
> example) -- this way, non-SSL requests wouldn't bother calling the
> connection-close phase's function.

I think mod_perl allows a handler to push other handlers onto a stack at
runtime. That's a good idea and one I'd like to see in the core. The
trick is to allow the handler to be "pushed" (perhaps a bad description
because it implies a 'stack', how about "shoved" :-) into any point of
the request's chain (linked list ?).

> This would also allow the server to optimize its request handling;
> because the phase functions would be distinct from the modules, it
> would know that it didn't have any check access functions (for
> example), so it wouldn't bother checking for them. It might also solve
> the "which modules comes first?" problem - maybe the add_request_phase
> function might include a priority value (each priority would be
> defined by the API, of course). Maybe even a run-all/run-one
> indication, unlike now, where some phases run all of the functions,
> and some run until they hit an OK (i.e, all the run-all functions
> would run, then the run-one ones).

This could be done, if handlers are allowed to switch off later
handlers, by having flags for different 'groups' of handlers.
The flags can be carried in the request record, and any group of
handlers should be able to define new flags and other shareable data in
the request record. The idea is to let handlers "communicate"
information among themselves without needing extra variables/code in
the core to organise things. A common set of these flags could be used,
say, to tell ALL subsequent handlers other than output and logging to
skip processing the request. Some handlers could default to skipping
everything unless they see some flag set... basically, chain handlers
together in any order the user wants, let them talk to each other to
decide who does what and who does nothing, and keep the core out of the
way as much as possible.

--
Rob Hartill                              Internet Movie Database (Ltd)
http://www.moviedatabase.com/   .. a site for sore eyes.
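
The group-flag idea above can be sketched in C roughly as follows. This
is a minimal illustration under assumed names, not a proposal for the
real request record: `req`, `skip_groups`, the `GRP_*` bits,
`run_chain`, `short_circuit` and `demo_groups_run` are all hypothetical.
Each handler belongs to a group, any handler may set skip bits in the
request, and the runner only consults those bits, so the core needs no
knowledge of what the groups mean.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical group bits; a real design would let modules define more. */
#define GRP_EARLY     (1u << 0)
#define GRP_TRANSLATE (1u << 1)
#define GRP_ACCESS    (1u << 2)
#define GRP_OUTPUT    (1u << 3)
#define GRP_LOGGING   (1u << 4)

/* Hypothetical request record carrying the shared flags. */
typedef struct req {
    unsigned skip_groups;    /* groups that an earlier handler switched off */
    unsigned ran_groups;     /* bookkeeping for the demo: groups that ran */
    int already_handled;     /* made-up input that triggers the short circuit */
} req;

typedef void (*handler_fn)(req *r);

typedef struct handler {
    const char *name;        /* unique identifier */
    unsigned group;          /* the group this handler belongs to */
    handler_fn fn;
} handler;

/* Run the chain in order, skipping handlers whose group is switched off. */
static void run_chain(const handler *chain, size_t n, req *r)
{
    assert(chain != NULL && r != NULL);
    for (size_t i = 0; i < n; i++) {
        if (r->skip_groups & chain[i].group)
            continue;
        chain[i].fn(r);
        r->ran_groups |= chain[i].group;
    }
}

static void noop(req *r) { (void)r; }

/* Tell every later handler except output and logging to skip this request. */
static void short_circuit(req *r)
{
    if (r->already_handled)
        r->skip_groups |= GRP_TRANSLATE | GRP_ACCESS;
}

/* Build a five-handler chain and report which groups actually ran. */
unsigned demo_groups_run(int already_handled)
{
    const handler chain[] = {
        { "short-circuit", GRP_EARLY,     short_circuit },
        { "translate",     GRP_TRANSLATE, noop },
        { "access",        GRP_ACCESS,    noop },
        { "output",        GRP_OUTPUT,    noop },
        { "log",           GRP_LOGGING,   noop },
    };
    req r = { 0u, 0u, already_handled };

    run_chain(chain, sizeof chain / sizeof chain[0], &r);
    return r.ran_groups;
}
```

With `already_handled` set, only the early, output and logging groups
run; otherwise all five do. The decision lives entirely in the handlers
and the flags they share, which is the point of keeping the core out of
the way.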