httpd-dev mailing list archives

From Greg Stein <gst...@gmail.com>
Subject Re: Httpd 3.0 or something else
Date Mon, 09 Nov 2009 20:08:54 GMT
On Mon, Nov 9, 2009 at 14:21, Paul Querna <paul@querna.org> wrote:
>...
> I agree in general: a serf-based core does give us a good start.
>
> But Serf buckets and the event loop definitely do need some more work
> -- simple things, like if the backend bucket is a socket, how do you
> tell the event loop that a would-block return value maps to a file
> descriptor talking to an origin server? You don't want to just keep
> looping over it until it returns data; you want to poll on the origin
> socket and only try to read when data is available.

The goal would be that the handler's (aka content generator, aka serf
bucket) socket would be processed in the same select() as the client
connections. When the bucket has no more data from the backend, it
returns "done for now". Eventually, all network reads/writes
finalize and control returns to the core loop. If data arrives on the
backend socket, the core wakes that bucket and it can read/return data.
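
To make that concrete, here's a rough sketch of how the core loop
might drain a response bucket, assuming serf's read semantics
(APR_EAGAIN as the "done for now" signal). drive_connection() is a
hypothetical helper, not existing serf or httpd API:

    /* Hypothetical core-loop helper (not serf/httpd API): drain a
     * response bucket to a writable client socket.  APR_EAGAIN from
     * serf_bucket_read() means "done for now" -- control goes back
     * to the select()/poll() loop until a socket becomes ready. */
    #include <serf.h>
    #include <apr_network_io.h>

    static void drive_connection(serf_bucket_t *resp, apr_socket_t *client)
    {
        const char *data;
        apr_size_t len;
        apr_status_t status;

        do {
            status = serf_bucket_read(resp, SERF_READ_ALL_AVAIL, &data, &len);
            if (SERF_BUCKET_READ_ERROR(status))
                return;  /* hard error: tear down the connection */
            if (len) {
                apr_size_t written = len;
                apr_socket_send(client, data, &written);
                /* a real loop would also track short writes here */
            }
        } while (!APR_STATUS_IS_EAGAIN(status) && !APR_STATUS_IS_EOF(status));
    }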

There are two caveats that I can think of, right offhand:

1) Each client connection is associated with one bucket generating the
response. Ideally, you would not bother to read that bucket
unless/until the client connection is ready for writing. But that
could create a deadlock internal to the bucket -- *some* data may need
to be consumed from the backend, processed, and returned to the
backend to "unstick" the entire flow (think SSL). Even though nothing
pops out the top of the bucket, internal processing may need to
happen. (There's a rough sketch of this after the second caveat.)

2) If you have 10,000 client connections, and some number of sockets
in the system ready for read/write... how do you quickly determine
*which* buckets to poll to get those sockets processed? You don't want
to poll 9,999 idle connections/buckets if only one is ready for
read/write. (Note: there are optimizations around this; if the bucket
wants to return data but wasn't asked for it, then next time around it
has the same data, so there's no need to drill all the way down to the
source bucket to attempt another network read -- though this kind of
sets up a busy loop until that bucket's client is ready for writing.)
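
Back to the SSL case in caveat (1): plain OpenSSL shows the shape of
the problem. A read of decrypted data can demand a backend *write*
first, so the loop must service the bucket's backend socket even while
the client sits idle. A rough illustration, using stock OpenSSL calls
rather than anything serf-specific:

    /* Illustration of the "unstick" problem: SSL_read() may report
     * WANT_WRITE (e.g. during renegotiation), meaning data must be
     * written to the backend before anything pops out the top. */
    #include <openssl/ssl.h>

    static int pump_ssl(SSL *ssl, char *buf, int buflen, int *want_write)
    {
        int n = SSL_read(ssl, buf, buflen);
        if (n > 0)
            return n;                /* decrypted data popped out */

        switch (SSL_get_error(ssl, n)) {
        case SSL_ERROR_WANT_READ:
            *want_write = 0;         /* poll the backend fd for read */
            return 0;
        case SSL_ERROR_WANT_WRITE:
            *want_write = 1;         /* must flush to the backend before
                                        more data can be produced */
            return 0;
        default:
            return -1;               /* hard error */
        }
    }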

Are either of these the considerations you were thinking of?

I can certainly see some kind of system to associate buckets and the
sockets that affect their behavior. Though that could get pretty crazy
since it doesn't have to be a 1:1 mapping. One backend socket might
actually service multiple buckets, and vice-versa.
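
One plausible shape for that association, sketched with APR hashes and
arrays; none of this is existing serf or httpd API, and the N:M
mapping is kept as a simple list of interested buckets per fd:

    /* Hypothetical fd -> interested-buckets registry.  When poll()
     * reports an fd ready, only the registered buckets get read,
     * instead of walking all 10,000 connections. */
    #include <apr_hash.h>
    #include <apr_tables.h>
    #include <apr_strings.h>

    typedef struct {
        apr_hash_t *by_fd;   /* int fd -> apr_array_header_t of buckets */
        apr_pool_t *pool;
    } bucket_registry_t;

    static void registry_add(bucket_registry_t *reg, int fd, void *bucket)
    {
        apr_array_header_t *list = apr_hash_get(reg->by_fd, &fd, sizeof(fd));
        if (!list) {
            int *key = apr_pmemdup(reg->pool, &fd, sizeof(fd));
            list = apr_array_make(reg->pool, 4, sizeof(void *));
            apr_hash_set(reg->by_fd, key, sizeof(fd), list);
        }
        APR_ARRAY_PUSH(list, void *) = bucket;
    }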

> I am also concerned about the patterns of sendfile() in the current
> serf bucket architecture, and making a whole pipeline do sendfile
> correctly seems quite difficult.

Well... it generally *is* quite difficult in the presence of SSL,
gzip, and chunking. Invariably, content is mangled before hitting the
network, so sendfile() rarely gets a chance to play ball.

But if you really are just dealing with plain files (maybe prezipped),
then read_for_sendfile() should be workable. Most buckets can't do
squat with it and should just use a default function. But the file
bucket can return a proper handle.
(And it is entirely possible/reasonable that the signature should be
adjusted to simplify the process.)
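
A sketch of how that could look, assuming the read_for_sendfile()
signature as it stands today (bucket, requested, hdtr, file, offset,
len) feeding APR's apr_socket_sendfile(); send_via_sendfile() itself
is hypothetical:

    /* Hypothetical helper: drain a file bucket through sendfile().
     * Assumes serf_bucket_read_for_sendfile() hands back a real
     * apr_file_t when the bucket is a plain file. */
    #include <serf.h>
    #include <apr_network_io.h>

    static apr_status_t send_via_sendfile(serf_bucket_t *bkt,
                                          apr_socket_t *client)
    {
        apr_hdtr_t hdtr = { 0 };
        apr_file_t *file;
        apr_off_t offset;
        apr_size_t len;
        apr_status_t status;

        do {
            status = serf_bucket_read_for_sendfile(bkt, SERF_READ_ALL_AVAIL,
                                                   &hdtr, &file, &offset,
                                                   &len);
            if (SERF_BUCKET_READ_ERROR(status))
                return status;
            if (len)
                status = apr_socket_sendfile(client, file, &hdtr,
                                             &offset, &len, 0);
        } while (status == APR_SUCCESS);

        return status;  /* APR_EAGAIN / APR_EOF bubble up to the core loop */
    }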

Cheers,
-g
