httpd-dev mailing list archives

From "Jeffrey W. Baker" <jwba...@acm.org>
Subject Re: layered I/O (was: cvs commit: ...)
Date Wed, 29 Mar 2000 09:21:09 GMT
On Tue, 28 Mar 2000, Roy T. Fielding wrote:
[ed]
> Layered-IO involves a cascaded sequence of filters that independently
> operate on a continuous stream in an incremental fashion.  Relayed-IO
> is a sequence of processing entities that opportunistically operate
> on a unit of data to transform it to some other unit of data, which
> can then be made available again to the other processing entities.
> The former is called a pipe-and-filter architecture, and the latter
> is called a blackboard architecture, and the major distinctions between
> the two are:
> 
>    1) in layered-IO, the handlers are identified by construction
>       of a data flow network, whereas in relayed-IO the handlers
>       simply exist in a "bag of handlers" and each one is triggered
>       based on the current data state;
> 
>    2) in layered-IO, the expectation is that the data is processed
>       as a continuous stream moving through handlers, whereas in
>       relayed-IO the data is operated upon in complete units and
>       control is implicitly passed from one processor to the next;
> 
>    3) in layered-IO, data processing ends at the outer layer,
>       whereas in relayed-IO it ends when the data reaches a special
>       state of "no processing left to be done".

Forgive me for jumping in here.  Sometimes those of us who are merely
observers of the core group do not get a perspective on the design
discussions that take place in private emails and in person.  Thus, what I
have to say will largely rehash what has been said already.

It seems to me that a well-rounded IO-layering system has already been
proposed here, in bits and pieces, by different people, over the course of
many threads.  The components of the system are: 1. a routine to place the
IO layers in the proper order, 2. a routine to send data between IO
layers, and 3. the layers themselves.

Selection of IO Layers

The core selects a source module and IO layers based on the urlspace
configuration.  Content might be generated by mod_perl, and the result is
piped through mod_chunk, mod_ssl, and mod_net, in turn.  When the content
generator runs, the core requires that the module set the content type
before the first call to ap_bput.  The content type is set with a function
call, ap_set_content_type(request_rec *, char *), which examines the type
and adds IO layers as necessary.  For server-parsed HTML, the core might
insert mod_include immediately after mod_perl.

(Can anyone produce a use case where the IO chain could change after
output begins?)
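
To make that concrete, here is a rough sketch of what ap_set_content_type
might do.  ap_add_io_layer and the io_layer type are names I am inventing
for the sake of the example, not a settled interface:

/* Sketch only: ap_add_io_layer() and io_layer are invented names for
 * whatever mechanism splices a layer into the chain. */
#include <string.h>

typedef struct request_rec request_rec;
typedef struct io_layer io_layer;

void ap_add_io_layer(request_rec *r, io_layer *layer);   /* assumed */
extern io_layer mod_include_layer;                        /* from mod_include */

void ap_set_content_type(request_rec *r, char *type)
{
    /* record the type on the request itself (elided) */

    if (strcmp(type, "text/x-server-parsed-html") == 0) {
        /* server-parsed HTML: run the generator's output through
         * mod_include before chunking, SSL, and the network layer */
        ap_add_io_layer(r, &mod_include_layer);
    }
}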

Interface Between IO Layers

The core is responsible for marshalling data between the IO layers.  Each
layer registers a callback function of type ap_status_t (*)(request_rec *,
buff_vec *) on which it receives input.  Data is sent to the next layer
using ap_bput(request_rec *, buff_vec *).  The buff_vec is simply an
ordered array of address and length pairs.  Whenever ap_bput is called,
the input callback of the next layer is called.  No message queueing,
async handlers, or any of that business is needed.  ap_bput keeps track of
where in the output chain things are.  Control flow in this system tends
to yo-yo up and down the IO chain.  Examples later.

The only other part of the IO interface is a flush routine.  The IO layers
are free to implement flushing however they see fit.
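
In rough C, the whole interface amounts to something like the following.
The field and function names, including ap_bflush for the flush routine,
are only illustrative:

#include <stddef.h>

typedef struct request_rec request_rec;
typedef int ap_status_t;

/* one address/length pair */
typedef struct {
    void   *addr;
    size_t  len;
} buff_range;

/* an ordered array of address/length pairs */
typedef struct {
    buff_range *elts;
    int         nelts;
} buff_vec;

/* each layer registers one of these to receive its input */
typedef ap_status_t (*ap_io_input_fn)(request_rec *r, buff_vec *vec);

/* send data to the next layer; the core knows where in the chain the
 * caller sits and invokes that layer's input callback */
ap_status_t ap_bput(request_rec *r, buff_vec *vec);

/* ask the layers downstream to flush whatever they are holding */
ap_status_t ap_bflush(request_rec *r);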

There are two notable things about this system.  First, control flow need
not ever reach the end of the output chain.  Any layer is free to return
without calling ap_bput.  The layers can do whatever they please with the
data.  The network module would be such an example.  It would always write
the buffers over the network, and never pass them down the IO chain.  If
mod_ssl wanted to handle networking itself, it could do that, too.  The
second notable thing is that once a buffer has been sent down the chain,
it is gone forever.  Later layers are responsible for freeing the memory
and whatnot.  Diddling in a buffer that has already been sent would be bad
form.
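
As an illustration, the network layer's input callback might look roughly
like this, reusing the declarations sketched above.  The socket accessor
and the free_buff_vec helper are invented names:

#include <unistd.h>

int  ap_request_socket(request_rec *r);              /* assumed accessor */
void free_buff_vec(request_rec *r, buff_vec *vec);   /* assumed helper   */

static ap_status_t net_input(request_rec *r, buff_vec *vec)
{
    int fd = ap_request_socket(r);
    int i;

    for (i = 0; i < vec->nelts; i++) {
        /* push each address/length pair onto the wire; real code would
         * check for short writes and probably use writev() instead */
        write(fd, vec->elts[i].addr, vec->elts[i].len);
    }

    /* the buffers were handed to us for good -- we own them now */
    free_buff_vec(r, vec);
    return 0;                             /* never calls ap_bput */
}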

Layer Implementation

This system has implications for the design and implementation of the
layers.  Clearly, it would not be efficient to call ap_bput too often.
Also, the IO layers must be re-entrant in the threaded MPMs, so they will
need some mechanism for storing module-specific state information in the
request context (think mod_include when an include directive spans ap_bput
calls).
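
For instance, mod_include's callback might keep a small per-request
context so that a directive split across two ap_bput calls can be
reassembled.  The ap_get_layer_ctx, ap_set_layer_ctx, and
ap_layer_ctx_alloc names are stand-ins for whatever per-request storage
mechanism we settle on:

typedef struct {
    char   partial[256];   /* bytes of an unfinished "<!--#" directive */
    size_t partial_len;    /* how much of it arrived in the last call  */
} include_ctx;

static ap_status_t include_input(request_rec *r, buff_vec *vec)
{
    include_ctx *ctx = ap_get_layer_ctx(r, &mod_include_layer);

    if (ctx == NULL) {
        /* first call for this request: allocate and register our state */
        ctx = ap_layer_ctx_alloc(r, sizeof(*ctx));
        ap_set_layer_ctx(r, &mod_include_layer, ctx);
    }

    /* scan vec for directives, using ctx->partial to stitch together
     * one that began at the end of the previous call (elided) */

    return ap_bput(r, vec);
}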

There will be basically three types of layers: those that insert content
into the stream (chunking, SSI), those that replace the stream completely
(encryption, compression), and those that sink the stream (network).  All
three types involve minimal copying: the inserting layers merely move the
boundaries of the incoming buffers and insert a new buffer.  The
replacement layers have to create a new buffer and deallocate the old one,
but that cannot be avoided in any case.  The sinks merely deallocate the
buffers, so no problems there.
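
The boundary-moving trick for an inserting layer might look roughly like
this: shrink the range that contains the directive, splice in the
replacement text as a new range, and re-add the tail of the original
buffer.  buff_vec_insert is an invented helper; note that no payload bytes
are copied:

static void expand_directive(buff_vec *vec, int i,
                             size_t dir_start, size_t dir_end,
                             void *text, size_t text_len)
{
    char   *base = vec->elts[i].addr;
    size_t  len  = vec->elts[i].len;

    /* 1. shrink the original range so it ends where the directive starts */
    vec->elts[i].len = dir_start;

    /* 2. splice in the replacement text as a new range */
    buff_vec_insert(vec, i + 1, text, text_len);

    /* 3. re-add the bytes that followed the directive */
    buff_vec_insert(vec, i + 2, base + dir_end, len - dir_end);
}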

Analysis by Example

I considered two examples when coming up with this design.  One is content
which is dynamically generated by mod_perl, filtered through SSI, chunked,
encrypted, and sent over the wire.  The other is fast static content
serving, where a module is blasting out pre-computed HTTP responses a la
SGI's 10x patches.

In the first situation, imagine that a 10 KB document is generated which
contains two include directives.  The include directives insert a standard
banner and the contents of a 40 KB file.  The generating module outputs
the data via one ap_set_content_type call and five separate ap_bput calls.  
To see the worst case, assume that both include directives span ap_bput
calls.  Assume that the included content does not contain any include
directives.

The IO chain initially looks like this:

mod_perl->mod_chunk->mod_ssl->mod_net

After the content type is set, the chain changes:

mod_perl->mod_include->mod_chunk->mod_ssl->mod_net

During the inclusion of the 40 KB file, mod_include allocates a series of
4 KB buffers, fills them from the file, and sends them down the chain (or
maybe it uses mmap).  The analysis is left to the reader, but the end
result is that ap_bput is called 50 times during the request phase.  Is
that a lot?  Consider the amount of work being done, and the fact that we
have avoided all the overhead of using, for example, actual pipes, or
thread-safe queueing.  Calling functions in a single userland context is
known to be fast.  The number of calls could be reduced if mod_include
used a larger internal buffer, but at the expense of memory consumption
(or it could use mmap).  Note also that the number of ap_bput calls does
not translate into packets on the wire.  mod_net is free to do whatever is
optimal with respect to packet boundaries.

The second example represents high performance static content delivery.  
The content-generating module has all headers and content cached or mapped
in memory.  The entire output phase is accomplished in a single ap_bput
call, and the networking module does The Right Thing to ensure best
network usage.
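
In rough terms, such a handler boils down to something like this.  The
cache_entry type and its fields are invented for the example:

typedef struct {
    char   *content_type;
    void   *headers;       /* precomputed HTTP response headers */
    size_t  headers_len;
    void   *body;          /* the content, cached or mmap()ed */
    size_t  body_len;
} cache_entry;

static ap_status_t serve_cached(request_rec *r, cache_entry *e)
{
    /* headers and body go out as two address/length pairs */
    buff_range elts[2] = {
        { e->headers, e->headers_len },
        { e->body,    e->body_len    },
    };
    buff_vec vec = { elts, 2 };

    ap_set_content_type(r, e->content_type);
    return ap_bput(r, &vec);    /* one call for the entire response */
}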

Am I rambling yet?  I'd like to get some opinions on this system, if
anybody feels it is significantly different from those already proposed.  
I realize that I have waved my hands regarding actually deciding when to
use what IO layers and where, but I am confident that a logically
appealing system could be devised.

Regards, 
Jeffrey

> 
> Yes, these two architectures are similar and can accomplish the
> same tasks, but they don't have the same performance characteristics
> and they don't have the same configuration interface.  And, perhaps
> most significantly, relayed-IO systems are not as reliable because
> it is very hard to anticipate how processing will occur and very easy
> for the system to become stuck in an infinite loop.
> 
> I don't want a blackboard architecture in Apache, regardless of the
> version of the release or how many users might be satisfied by the
> features it can implement.  It is unreliable and hard to maintain
> and adds too much latency to the response processing.  But if somebody
> else really wants such an architecture, and they understand its implications,
> then I won't prevent them from going with that solution -- I just
> don't want them thinking it is what we meant by layered-IO.


