Date: Tue, 10 Jun 1997 01:05:43 +0100 (BST)
From: Rob Hartill <robh@imdb.com>
To: new-httpd@apache.org
Subject: Re: Thoughts on a 2.0 API
In-Reply-To: <Pine.HPP.3.95.970609150137.7082A-100000@ace.nueva.pvt.k12.ca.us>
Message-ID: <Pine.NEB.3.96.970610000228.11803B-100000@localhost>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: new-httpd-owner@apache.org
Precedence: bulk
Reply-To: new-httpd@apache.org


Some responses, but I admit I've mostly hijacked Alexei's post to
throw in some of my own thoughts and ideas.

On Mon, 9 Jun 1997, Alexei Kosut wrote:

> Here are some thoughts I've had on the design of an API for Apache
> 2.0. Please feel free to ignore it.

> In earlier discussions, assuming we keep a similar request model in
> 2.0 (which I think is likely), we've discussed having at least the
> following phases:
> 
> 1. "connection open" phase
> 2. begin request phase
> 3. URL->URL translation
> 4. URL->filename translation
> 5. filename->filename translation
> 6. header parse phase

It might be confusing terminology, but I think 'header parsing'
belongs before 3.

I think we need to collect as much information about the request
as early as possible so that any translations that take place can use
that information. e.g. if a header specifies a preference for Greek
over English language then the URL->->->filename translation phase should
be able to act on that.


> 7. access check
> 8. user id check
> 9. auth check
> 10. type check
> 11. fixup phase
> 12. handler phase
> 13. . pre-header phase
> 14. . post-header (pre-body) phase
> 15. end request phase
> 16. logging phase
> 17. end connection phase
> 18. "connection closed" phase
> 
> And I may be missing some we've discussed.

Those should be considered some of the basics. I'd also like to be
able to squeeze any handler I come up with in between any of these
listed above, e.g. I might want to add a handler near phase 2 that
just aborts the connection if the incoming IP is a nuisance site.

> The point is, there are
> (will be) twice as many request phases in 2.0 as in 1.x. We've also
> had problems with the nature of the current model. Currently, most
> phases are run always, with the exception of the handler phase, which
> is keyed off of handler/media type.

Assuming we end up with a form of stacked handlers, I think it'd
be useful to allow any handler to switch off other handlers that follow
it for any particular request. To do that, handlers need a way
to uniquely identify themselves.
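
A minimal sketch of that idea in C, assuming each handler carries a
unique name and the request record keeps a small list of names that
have been switched off. None of this is an existing Apache API: the
names request, handler, disable_handler, run_chain and the three
example handlers are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DISABLED 16

/* Hypothetical request record: only the fields this sketch needs. */
typedef struct request {
    const char *disabled[MAX_DISABLED]; /* names of handlers switched off */
    size_t n_disabled;
    int trace;                          /* bitmask of handlers that ran */
} request;

typedef struct handler {
    const char *name;                   /* unique identifier */
    int (*fn)(request *r);
} handler;

/* Switch off a handler for this request only.
 * Returns 0 on success, -1 if the list is full. */
int disable_handler(request *r, const char *name)
{
    assert(r != NULL && name != NULL);
    if (r->n_disabled >= MAX_DISABLED)
        return -1;
    r->disabled[r->n_disabled++] = name;
    return 0;
}

static int is_disabled(const request *r, const char *name)
{
    for (size_t i = 0; i < r->n_disabled; i++)
        if (strcmp(r->disabled[i], name) == 0)
            return 1;
    return 0;
}

/* Run the chain in order, skipping handlers that an earlier handler
 * switched off. Returns the number of handlers actually run. */
int run_chain(request *r, const handler *chain, size_t n)
{
    int ran = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_disabled(r, chain[i].name))
            continue;
        chain[i].fn(r);
        ran++;
    }
    return ran;
}

/* Example handlers (hypothetical). The first one decides the request
 * is not worth authenticating and switches off the auth check. */
int block_nuisance(request *r)
{
    r->trace |= 1;
    disable_handler(r, "auth_check");
    return 0;
}

int auth_check(request *r)  { r->trace |= 2; return 0; }
int log_request(request *r) { r->trace |= 4; return 0; }
```

With a chain of block_nuisance, auth_check, log_request, only the
first and last would run for that request. Matching on a string name
is just one possible choice; a registered integer id would do as well.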

> What is the solution? How about making the API dynamic? A module_rec
> might just have an initializer phase (or two - we've discussed
> "run-on-fork" initializers as well as the "run-on-start" we have now),
> and a command table (or something). The initialization function would
> call things like:
> 
>    add_request_phase(&handler_func, HANDLER_PHASE, "text/html", M_GET|M_POST);
>    add_request_phase(&type_func, CHECK_TYPE_PHASE, NULL, M_ANY);
> 
> These might also be called from command functions (i.e., putting in
> the first "AddType" command would cause mod_mime to add a check_type
> phase).

mod_perl does something similar. You can push a list of functions (handlers)
into any of the current phases. What I've noticed about the way I (ab)use
that system is that categorising the phases breaks down into a free for
all when you decide at exactly which point you want the function called.
... you end up thinking to hell with what the phase is called and
supposed to do, I want my handler *there*.

If I want to call a bit of code very early then I currently stick it in
at the headerparser phase even though it has nothing to do with parsing
headers as such.

So, I think it's better to throw out the concept of named phases.
By all means start from the ordered phases we have and propose, but
don't enforce that order on everyone implicitly or explicitly. I should
be allowed to juggle any of the core handlers around to achieve my
objectives without feeling guilty about it.

> Of course, these would be stored like per-dir configs are now,
> and merged for the request. In fact, add_request_phase might even be
> called *during* the request. For example, a connection-open phase that
> activates SSL might add a connection-close phase that turned off SSL
> (I don't know exactly how SSL works, so this might not be a good
> example) -- this way, non-SSL requests wouldn't bother calling the
> connection-close phase's function.

I think mod_perl allows a handler to push other handlers onto a stack
at runtime. That's a good idea and one I'd like to see in the core.
The trick is to allow the handler to be "pushed" (perhaps a bad description
because it implies a 'stack', how about "shoved" :-) into any point of
the request's chain (linked list ?).
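
As a rough illustration of "shoving" a handler into an arbitrary
point, here is a sketch that assumes the request's chain is a singly
linked list of named nodes. The names chain_node, chain_shove,
chain_index and chain_free are made up for this example and are not
taken from Apache or mod_perl.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One entry in a request's handler chain (hypothetical layout). */
typedef struct chain_node {
    const char *name;            /* unique handler name */
    int (*fn)(void *request);    /* the handler itself; may be NULL here */
    struct chain_node *next;
} chain_node;

/* Insert a new handler directly after the node named `after`,
 * or at the head of the chain when `after` is NULL.
 * Returns 0 on success, -1 if `after` is not in the chain or
 * memory allocation fails. */
int chain_shove(chain_node **head, const char *after,
                const char *name, int (*fn)(void *request))
{
    assert(head != NULL && name != NULL);

    chain_node **link = head;
    if (after != NULL) {
        chain_node *cur = *head;
        while (cur != NULL && strcmp(cur->name, after) != 0)
            cur = cur->next;
        if (cur == NULL)
            return -1;
        link = &cur->next;
    }

    chain_node *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->name = name;
    node->fn = fn;
    node->next = *link;
    *link = node;
    return 0;
}

/* Position of the named handler in the chain, or -1 if absent. */
int chain_index(const chain_node *head, const char *name)
{
    int i = 0;
    for (; head != NULL; head = head->next, i++)
        if (strcmp(head->name, name) == 0)
            return i;
    return -1;
}

/* Release every node in the chain. */
void chain_free(chain_node *head)
{
    while (head != NULL) {
        chain_node *next = head->next;
        free(head);
        head = next;
    }
}
```

Because insertion is keyed on a neighbouring handler's name rather
than on a phase, a handler running mid-request could call chain_shove
on the request's own chain to add a follow-up handler wherever it
likes.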

> This would also allow the server to optimize its request handling;
> because the phase functions would be distinct from the modules, it
> would know that it didn't have any check access functions (for
> example), so it wouldn't bother checking for them. It might also solve
> the "which modules comes first?" problem - maybe the add_request_phase
> function might include a priority value (each priority would be
> defined by the API, of course). Maybe even a run-all/run-one
> indication, unlike now, where some phases run all of the functions,
> and some run until they hit an OK (i.e, all the run-all functions
> would run, then the run-one ones).

This could be done if handlers are allowed to switch off later
handlers by having flags for different 'groups' of handlers. The
flags can be carried in the request record and any group of handlers
should be able to define new flags and other shareable data in the
request record. The idea being to let handlers "communicate" information
among themselves without needing extra variables/code in the core
to organise things. A common set of these flags could be used, say, to
tell ALL subsequent handlers other than output and logging to skip
processing the request. Some handlers could default to skipping
everything unless they see some flag set...
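
A small sketch of the group-flag idea, assuming the flags live in the
request record as a bitmask and every handler declares which group it
belongs to. The group names, the request and handler structs, and the
functions below are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler groups, one bit each. */
enum {
    GRP_TRANSLATE = 1 << 0,
    GRP_AUTH      = 1 << 1,
    GRP_CONTENT   = 1 << 2,
    GRP_OUTPUT    = 1 << 3,
    GRP_LOGGING   = 1 << 4
};

/* Hypothetical request record carrying the shared flags. */
typedef struct request {
    unsigned skip_groups;  /* groups that must skip this request */
    unsigned ran_groups;   /* groups that ran (kept only for the sketch) */
} request;

typedef struct handler {
    unsigned group;              /* which group this handler belongs to */
    void (*fn)(request *r);      /* may be NULL for a no-op handler */
} handler;

/* Tell every later handler except output and logging to do nothing. */
void skip_all_but_output_and_logging(request *r)
{
    assert(r != NULL);
    r->skip_groups |= ~(unsigned)(GRP_OUTPUT | GRP_LOGGING);
}

/* Walk the chain in order; the core only consults the flags and
 * otherwise stays out of the way. */
void run_chain(request *r, const handler *chain, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (r->skip_groups & chain[i].group)
            continue;
        r->ran_groups |= chain[i].group;
        if (chain[i].fn != NULL)
            chain[i].fn(r);
    }
}

/* Example handler: rejects the request and silences the rest. */
void deny_request(request *r)
{
    skip_all_but_output_and_logging(r);
}
```

If an auth-group handler calls deny_request, a later content handler
is skipped while output and logging still run. A handler that should
default to doing nothing could invert the test and run only when some
agreed flag is set.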

Basically, chain handlers together in any order the user wants,
let them talk to each other to decide who does what and who does nothing
and keep the core out of the way as much as possible.

--
Rob Hartill                              Internet Movie Database (Ltd)
http://www.moviedatabase.com/   .. a site for sore eyes.