httpd-dev mailing list archives

From Dean Gaudet
Subject Re: work in progress: mpm-3.tar.gz (fwd)
Date Fri, 18 Jun 1999 16:29:25 GMT
On Fri, 18 Jun 1999, Zeev Suraski wrote:

> Is new-httpd moderated?  Looks like every letter I send gets censored,
> which is really weird, since I see all sorts of crap on the list, whereas
> my posts are usually technical...
> Anyway, I'm sending this to you directly to ensure you get a chance to
> look at it...

It only allows subscribers to post... are you using the same subscription
addr as you are for sending messages to the list?

Brian Behlendorf deals with the non-subscriber posts, and he sometimes
lags by a few days... 

> On Thu, 17 Jun 1999, Dean Gaudet wrote:
> > We will impose an additional restriction on modules -- if threads are in
> > use, they may not make any assumption that the same thread will be used to
> > process all phases of a request.  Put another way -- thread local storage
> > is useless... and there will be no "thread_init_hook" function to tell
> > modules when threads have been created.  This restriction is to give us
> > access to hybrid async/sync techniques.  Modules needing information
> > persisting between request phases should use request-specific data
> > (or connection-specific data).
> Ouch, that's a very aggressive restriction.  It pretty much requires any
> module that uses local storage to be Apache specific, since it would have
> to save information in Apache's per-request or per-connection structure
> (not to mention it would have to pass pointers to these structures all
> over to any function that may require access to these globals, which is
> terrible). 

Apache already passes around a request_rec pretty much everywhere... I
suppose we can implement apache-specific "thread local storage", which we
save and restore if we ever switch threads... but for first implementation
I'm really not going to worry about it...

> I really urge you to reconsider this.  For PHP 4.0 (Zend), I've written a
> platform independent local storage resource manager (that works very well
> in the threaded ISAPI/IIS4 environment), but the whole approach will be
> rendered completely useless with such restrictions, since it's based on
> the thread id, and obviously expects that all steps and hooks are called
> within the same thread

You could base your storage off the conn_rec * instead of a thread_id... 
Also, I was planning on building a connection_id which is a densely packed
small integer, because the scoreboard will need something like this. 
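
A minimal sketch of storage keyed by connection rather than by thread,
assuming the connection_id is a small integer below a fixed bound.  The
names conn_rec_sketch, MAX_CONNECTIONS, conn_storage_set and
conn_storage_get are hypothetical and are not Apache API:

```c
#include <stddef.h>

/* Hypothetical upper bound on simultaneously open connections. */
#define MAX_CONNECTIONS 1024

/* Hypothetical stand-in for conn_rec; only the id matters here. */
typedef struct {
    int connection_id;   /* densely packed small integer, 0..MAX_CONNECTIONS-1 */
} conn_rec_sketch;

/* One slot per connection, replacing a table indexed by thread id.
 * Assumes a connection is served by at most one thread at a time,
 * so no locking is done in this sketch. */
static void *conn_storage[MAX_CONNECTIONS];

/* Store module state for this connection.
 * Returns 0 on success, -1 if the id is out of range. */
int conn_storage_set(const conn_rec_sketch *c, void *data)
{
    if (c->connection_id < 0 || c->connection_id >= MAX_CONNECTIONS)
        return -1;
    conn_storage[c->connection_id] = data;
    return 0;
}

/* Fetch module state for this connection, or NULL if none or out of range. */
void *conn_storage_get(const conn_rec_sketch *c)
{
    if (c->connection_id < 0 || c->connection_id >= MAX_CONNECTIONS)
        return NULL;
    return conn_storage[c->connection_id];
}
```

Because the id is dense, a plain array lookup is enough; no hashing on a
thread id is needed.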

The main case I'm considering this for is the handler phase.  In general,
any request goes through a bunch of protocol stages and reaches the
handler, and from there it fits into a few small categories:

1. copy a file fd back to the client
2. copy a pipe/socket fd (from another process) back to the client 
3. copy a mmapped region back to the client
4. copy a dynamically generated memory region back to the client
5. the handler writes stuff to a BUFF, and it's sent to the client

1, 2, 3, and 4 are very simple cases where, if the stuff to be sent doesn't
fit in the socket's send buffer, we have to block the thread serving
the response.  At this point we're potentially consuming an expensive
resource (a thread, stack, kernel memory for the thread, ...) just to wait
for the client.

Instead we can switch to asynchronous behaviour.  All of 1, 2, 3, 4 are
obvious -- the handler is essentially in a loop which we all know the
structure of, because the entire object is already generated somewhere.
The handler at this point sets up a special new record in the conn_rec,
and will return with a special return code indicating the switch to
asynchronous behaviour.
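
A rough sketch of what such a hand-off might look like for case 4 (a
memory region).  Every name here (HANDLER_ASYNC, async_send_rec,
conn_sketch, memory_handler) is made up for illustration; the actual
record layout and return code are not specified above:

```c
#include <stddef.h>

/* All names below are hypothetical; none are Apache API. */
#define HANDLER_DONE   0
#define HANDLER_ASYNC  1   /* the "special return code": switch to async */

/* Which of cases 1-4 the pending send belongs to. */
typedef enum { SRC_FILE_FD, SRC_PIPE_FD, SRC_MMAP, SRC_MEMORY } async_source;

/* The "special new record": everything needed to finish the send later. */
typedef struct {
    async_source kind;
    int fd;             /* used by SRC_FILE_FD / SRC_PIPE_FD, else -1 */
    const char *base;   /* used by SRC_MMAP / SRC_MEMORY, else NULL */
    size_t length;      /* total bytes to send from base */
    size_t sent;        /* bytes already written to the client */
} async_send_rec;

/* Stand-in for conn_rec. */
typedef struct {
    int client_fd;
    async_send_rec *async;  /* NULL unless an async send is pending */
} conn_sketch;

/* Case 4: the whole response already sits in memory, so the handler
 * only describes it and returns; it never loops on write() itself. */
int memory_handler(conn_sketch *c, async_send_rec *rec,
                   const char *body, size_t len)
{
    rec->kind = SRC_MEMORY;
    rec->fd = -1;
    rec->base = body;
    rec->length = len;
    rec->sent = 0;
    c->async = rec;
    return HANDLER_ASYNC;
}
```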

If the MPM supports async stuff, then it will release this thread from
serving the rest of this request... otherwise there'll be a library to
handle the "async" stuff synchronously.  This async stuff will be
completed using select/poll/non-blocking i/o (or other similar variants,
there are several other faster methods on other platforms).  We've freed
up the expensive resource:

- consume less CPU per client
- handle thousands upon thousands of long haul slow clients, because
  they're just a conn_rec/request_rec at this point... no kernel stack, no
  context switching... really, just minimal resource consumption
- do better on existing benchmarks, but more importantly, do better on
  real world problems
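
For illustration, the non-blocking send and its synchronous fallback
could be sketched with POSIX write() and poll() as below.  The function
names are hypothetical, and this is only one way to structure it, not
the planned implementation:

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Write as much of buf as the descriptor accepts right now.
 * Returns 1 when everything is sent, 0 when the send buffer is full
 * (the caller should wait for POLLOUT), -1 on a real error. */
int async_send_step(int fd, const char *buf, size_t len, size_t *sent)
{
    while (*sent < len) {
        ssize_t n = write(fd, buf + *sent, len - *sent);
        if (n > 0)
            *sent += (size_t)n;
        else if (n == -1 && errno == EINTR)
            continue;                       /* interrupted, just retry */
        else if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return 0;                       /* would block: yield here */
        else
            return -1;                      /* error or unexpected 0 */
    }
    return 1;
}

/* Fallback for an MPM without async support: the same steps, but the
 * calling thread itself waits in poll() between them. */
int send_synchronously(int fd, const char *buf, size_t len)
{
    size_t sent = 0;
    for (;;) {
        int rc = async_send_step(fd, buf, len, &sent);
        if (rc != 0)
            return rc == 1 ? 0 : -1;
        struct pollfd p = { .fd = fd, .events = POLLOUT, .revents = 0 };
        if (poll(&p, 1, -1) == -1 && errno != EINTR)
            return -1;
    }
}
```

An async MPM would call async_send_step() for many connections from one
poll() loop; the fallback just runs that loop for a single connection.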

At some point in the future the async stuff will finish, and a caller
supplied "completion" function will be called in another thread.  This
gets us back out of the async core, and into protocol code, which will
"resume" the handler.  This way the async core really has no knowledge of
the protocols involved -- and we can use this technique for any protocol
(and for variations on 1, 2, 3, 4)...
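
The completion hook can be sketched as a function pointer plus an
opaque context, which is what keeps the async core ignorant of the
protocol.  All types and names here are hypothetical:

```c
#include <stddef.h>

/* Caller-supplied completion function; ctx is opaque to the core. */
typedef void (*completion_fn)(void *ctx, int status);

/* What the async core keeps per operation.  Note there is nothing
 * protocol-specific in here. */
typedef struct {
    completion_fn on_complete;
    void *ctx;
} async_op;

/* Called by the async core, possibly on a different thread from the
 * one that started the operation, once the I/O has finished. */
void async_core_finish(async_op *op, int status)
{
    if (op->on_complete != NULL)
        op->on_complete(op->ctx, status);
}

/* Protocol side: hypothetical per-request state for an HTTP handler. */
typedef struct {
    int resumed;       /* set once the handler has been resumed */
    int last_status;   /* status reported by the async core */
} http_request_sketch;

/* Completion function supplied by the protocol code; this is where
 * the handler would be "resumed". */
void http_resume_handler(void *ctx, int status)
{
    http_request_sketch *r = ctx;
    r->resumed = 1;
    r->last_status = status;
}
```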

5. is the case for modules which don't want to take advantage of the async
features.  But we can give them help by turning them into case 4 with more
features for BUFF... such as buffer up to 50k responses, and do the async
thing, otherwise do it synchronously. 
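
One possible shape for that BUFF feature, as a sketch only: buffer
writes while they fit under the limit, and fall back to synchronous
mode the first time they do not.  buff_sketch and its functions are
hypothetical, and the 50k figure is taken from the example above:

```c
#include <stddef.h>
#include <string.h>

#define ASYNC_BUFFER_LIMIT (50 * 1024)   /* "buffer up to 50k" */

typedef enum { BUFF_BUFFERING, BUFF_SYNCHRONOUS } buff_mode;

/* Hypothetical stand-in for BUFF with just the buffering logic. */
typedef struct {
    char data[ASYNC_BUFFER_LIMIT];
    size_t used;
    buff_mode mode;
} buff_sketch;

void buff_init(buff_sketch *b)
{
    b->used = 0;
    b->mode = BUFF_BUFFERING;
}

/* Returns 1 if the data was buffered.  Returns 0 if it did not fit (or
 * the buffer had already overflowed); the caller must then flush
 * data[0..used) and this chunk with ordinary blocking writes. */
int buff_write(buff_sketch *b, const void *src, size_t len)
{
    if (b->mode == BUFF_BUFFERING && len <= ASYNC_BUFFER_LIMIT - b->used) {
        memcpy(b->data + b->used, src, len);
        b->used += len;
        return 1;
    }
    b->mode = BUFF_SYNCHRONOUS;
    return 0;
}

/* At the end of the handler: if nothing overflowed, the whole response
 * is a memory region and can be treated as case 4. */
int buff_can_finish_async(const buff_sketch *b)
{
    return b->mode == BUFF_BUFFERING;
}
```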

> Frankly, with such restrictions, I'm not sure how
> something at the complexity of a scripting language can be implemented as
> an Apache module.  If it can, it would have to tie the implementation to
> Apache very closely (PHP 4.0's implementation actually allows the same DLL
> or library to be used for the CGI version, Apache version and IIS version,
> with thin server-specific wrappers;  with such restrictions, it doesn't
> seem possible). 
> If I'm missing something obvious, please enlighten me :)

I think you're missing something slightly non-obvious :)  Or I'm still
missing your point...

In essence I'm saying that the "thread local storage" is part of the
conn_rec (or request_rec, whichever is most convenient for you).  All
entry points into your module include a request_rec structure -- you can
fetch a void * pointer from your request_data entry; you can store
whatever you need there. 
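
As a sketch of that pattern: the module allocates its state the first
time it sees a request and finds it again through the request on every
later entry point, whichever thread happens to be running.  The struct
layout and function names are hypothetical; only the idea of a void *
slot reachable from the request comes from the description above:

```c
#include <stdlib.h>

/* Hypothetical stand-in for request_rec with a per-module slot. */
typedef struct {
    void *request_data;
} request_sketch;

/* State the module would otherwise have kept in thread local storage. */
typedef struct {
    int phase_count;
} module_state;

/* Fetch this request's module state, creating it on first use.
 * Returns NULL only if allocation fails. */
module_state *get_module_state(request_sketch *r)
{
    if (r->request_data == NULL)
        r->request_data = calloc(1, sizeof(module_state));
    return r->request_data;
}

/* An entry point into the module: it is handed the request, so it
 * never needs to ask which thread it is on.  Returns the number of
 * phases seen so far for this request, or -1 on allocation failure. */
int module_phase_hook(request_sketch *r)
{
    module_state *s = get_module_state(r);
    if (s == NULL)
        return -1;
    return ++s->phase_count;
}

/* Release the state when the request ends. */
void request_cleanup(request_sketch *r)
{
    free(r->request_data);
    r->request_data = NULL;
}
```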

So think of the thread as a resource which happens to execute your code
for a while, but think of the conn_rec/request_rec as your indication of
what is going on.

Apache *could* support "thread local storage", and if this really bothers
you then I'll encourage you to supply a patch.  We can change the
requirement this way:

- MPMs which do not guarantee to use the same thread for all request
phases must save and restore the thread local storage across such changes
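
A sketch of what save and restore could look like if modules kept their
per-thread data in slots owned by the server.  This assumes C11
_Thread_local, which is itself a portability assumption, and all names
(TLS_SLOTS, sketch_tls_set, mpm_tls_save, ...) are hypothetical:

```c
#include <stddef.h>

#define TLS_SLOTS 8   /* hypothetical fixed number of module slots */

/* Per-thread slots that modules use instead of native thread local
 * storage.  _Thread_local is C11; older compilers would need a shim. */
static _Thread_local void *tls_slots[TLS_SLOTS];

/* Room in the connection record for the saved slots. */
typedef struct {
    void *saved_tls[TLS_SLOTS];
} conn_tls_sketch;

/* Module-facing accessors.  Setter returns 0, or -1 for a bad slot. */
int sketch_tls_set(int slot, void *value)
{
    if (slot < 0 || slot >= TLS_SLOTS)
        return -1;
    tls_slots[slot] = value;
    return 0;
}

void *sketch_tls_get(int slot)
{
    if (slot < 0 || slot >= TLS_SLOTS)
        return NULL;
    return tls_slots[slot];
}

/* Called by the MPM on the thread that is about to let go of the
 * connection: copy the slots out, then clear them for the next user. */
void mpm_tls_save(conn_tls_sketch *c)
{
    for (int i = 0; i < TLS_SLOTS; i++) {
        c->saved_tls[i] = tls_slots[i];
        tls_slots[i] = NULL;
    }
}

/* Called by the MPM on whichever thread picks the connection back up. */
void mpm_tls_restore(const conn_tls_sketch *c)
{
    for (int i = 0; i < TLS_SLOTS; i++)
        tls_slots[i] = c->saved_tls[i];
}
```

This only works for data stored through the accessors; anything a module
keeps in the platform's own thread local storage is invisible to the MPM.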

... but this is really hard to do portably -- unless we require all
modules to go through a portable, Apache-provided thread local storage API.  Which
means I'd rather it wait for APR, or rather someone else take care of
it... 'cause the stuff which I'm working on is busy enough already :)

