httpd-dev mailing list archives

From r..@ai.mit.edu (Robert S. Thau)
Subject Re: SSI handlers
Date Mon, 10 Jul 1995 19:49:47 GMT
   Date: Mon, 10 Jul 1995 08:53:54 -0700
   From: Cliff Skolnick <cliff@organic.com>
   Precedence: bulk
   Reply-To: new-httpd@hyperreal.com

This note is hard to respond to, because there are discussions of two
issues here, which have gotten kind of intermingled.  These two issues
(which I see as pretty much distinct) are:

   1) Using a database instead of a filesystem as a server back-end
   2) embedding code (not just new directives, but actual code) in
      server-side include documents, via, e.g., a <TCL> tag.

First, topic #2.

   I'm not sure we will need threading for HTTP-NG.  I don't see why one
   thread can't handle multiple requests at once to be honest.  All you
   really need is some sort of async I/O.  This can be simulated in an OS
   neutral way, unlike threads.  Of course many of the same concerns and
   issues would apply in the code either way.

To begin with, my problem with your <TCL> tag has less to do
with multithreading per se than it has to do with simply having
multiple requests served by one process, regardless of how that's
accomplished.  Consider a "document" with the following embedded code:

   <!--#TCL-->
   while {1} {open /dev/null r}
   <!--#/TCL-->

--- I hope you'll pardon me for using my own suggested syntax.

Whatever process winds up serving this document will quickly run out
of file descriptors (too quickly to be caught by a timeout, for
instance) --- at which point all other requests being served by the
same process would be totally screwed.  As I've already pointed out to
you, detecting and recovering from these sorts of situations, in the
general case, is very difficult.  This has nothing to do with
multithreading per se --- it has to do with having one process serving
multiple requests, no matter how that is accomplished.

(You can say that no one would write that sort of thing deliberately,
and at Organic, that may even be true.  But sooner or later, it's
going to happen by accident --- or worse, you'll have a slow leak
which doesn't get detected until the "document" gets loose on your
primary server and brings *it* down.  If you're doing one request per
process, you have exit() as a last-ditch way to escape from these
situations, but if multiple requests are being served by the same
process, and for HTTP-NG they basically have to be, there is no way
out).

As to using async I/O, instead of full multithreading --- it is
*possible*.  However, the only way I see to make it actually work
would be to write the entire server (or at least, anything which did
potentially blocking I/O, including all the response handlers) in the
same style as the protocol state machines of a kernel device driver.
(Anytime a request could block, you have to save *all* the information
that would be needed to resume it before you can go do something else
--- at enormous cost to the clarity of your code).  

I have worked on code that was actually written like this.  (It was
control software for some experiments in a biology lab, which ran on a
non-preemptive system and had to deal with real-time constraints, so
this sort of state machine approach was the only option).  I don't
want to repeat the experience, which is why I don't regard this as a
productive option.

----------------------------------------------------------------

Now, on to topic #1 --- skipping point-by-point replies, the heart of
it is:

   I don't expect the Apache group to do this, but I do want to see APIs that
   support a database vendor or an interested third party (like Organic)
   doing this.  This is the future.

Here is a list of ways in which Shambhala is currently wedded to a
filesystem as a back-end:

   1) Translation handlers have to be translating into some sort of a
      namespace; currently, the filesystem is it.

   2) The server core scans the translated pathnames, looking for
      .htaccess files to read per-directory permissions out of.  A
      database-back-ended server would presumably want a similar
      mechanism, but coming up with a suitably general interface
      is extremely difficult.

   3) The response handlers all invariably use fopen() to get at the
      filesystem object whose name popped out of the translation
      handler. 

I've thought fairly long and hard about how to come up with an API
which generalizes all these things, and I can't.  I have since given up
on 3) and decided that anything in the DB back-end would need its own
response handler.  I have a few ideas about how that could work (it
helps to start distinguishing the internal object type from the type
that will be served to the client, so that you can dispatch on the
former and do content negotiation on the latter), but it's still quite
messy --- particularly #2: what interface do you provide to the
per-directory command stuff?

It took me a couple of *months* to come up with clean APIs for what
Shambhala does now --- I was effectively AWOL for quite a bit longer
than people seem to have noticed.  I expect it would take at least
an equivalent amount of time to come up with a good clean design for
this and make it work, and I'm not sure I have the time for that right
now.  Sigh...

rst


