httpd-dev mailing list archives

From: Graham Leggett <minf...@sharp.fm>
Subject: Re: Httpd 3.0 or something else
Date: Tue, 10 Nov 2009 17:54:54 GMT
Greg Stein wrote:

>> Who is "you"?
> 
> Anybody who reads from a bucket. In this case, the core network loop
> when a client connection is ready for writing.

So would it be correct to say that in this theoretical httpd, the httpd
core, and nobody else, would read from the serf bucket?

>> Up till now, my understanding is that "you" is the core, and therefore
>> not under control of a module writer.
>>
>> Let me put it another way. Imagine I am a cache module. I want to read
>> as much as possible as fast as possible from a backend, and I want to
>> write this data to two places simultaneously: the cache, and the
>> downstream network. I know the cache is always writable, but the
>> downstream network I am not sure of, I only want to write to the
>> downstream network when the downstream network is ready for me.
>>
>> How would I do this in a serf model?
> 
> No module *anywhere* ever writes to the network.
> 
> The core loop reads/pulls from a bucket when it needs more data (for
> writing to the network).
> 
> When your cache bucket reads from its interior bucket, it can also
> drop the content into a file, off to the side. Think of this bucket as
> a filter. All content that is read through it will be dumped into a
> file, too.

Makes sense, but what happens when the cache has finished reading the
interior bucket after the first pass through the code?
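
For concreteness, here is roughly how I picture that tee. The read
signature is serf-style from memory; the context struct and the cache
file handling are hypothetical:

#include <apr_file_io.h>
#include "serf.h"

/* Hypothetical context for a caching "tee" bucket: everything read
 * through it is also dropped into a file, off to the side. */
typedef struct {
    serf_bucket_t *interior;   /* the wrapped response bucket */
    apr_file_t *cache_file;    /* the cache entry being written */
} cache_bucket_ctx_t;

static apr_status_t cache_bucket_read(serf_bucket_t *bucket,
                                      apr_size_t requested,
                                      const char **data,
                                      apr_size_t *len)
{
    cache_bucket_ctx_t *ctx = bucket->data;
    apr_status_t status;

    /* Pull whatever is available right now; never block here. */
    status = serf_bucket_read(ctx->interior, requested, data, len);

    if (*len > 0) {
        /* Drop a copy into the cache as the data flows through. */
        apr_size_t written = *len;
        apr_file_write(ctx->cache_file, *data, &written);
    }

    if (APR_STATUS_IS_EOF(status)) {
        /* One-time use: all data has been returned, so the entry
         * can be completed and its resources released early. */
        apr_file_close(ctx->cache_file);
    }

    return status;
}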

At this point, my cache needs to make a decision, and before it can make
that decision it wants to know whether upstream is capable of swallowing
the data right now without blocking.

If the answer is yes, I cache the data, pass it upstream, and wait to
be called again immediately, because I know upstream won't block.

If the answer is no, I *don't* pass data upstream (because it would
block from my perspective), and I read from the interior bucket again,
cache some more, and then ask again whether to pass the two data chunks
upstream.

How does my cache get the answer to its question?

And how does my cache code know when it is safe to read from the
interior bucket without blocking?
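
To make the question concrete, the per-event step I want to run looks
something like this in pseudo-C. Every name here is hypothetical, and
the two *_ready() predicates are exactly the primitives I don't know
how to express in the serf model:

#include <apr_general.h>

typedef struct cache_ctx_t cache_ctx_t;            /* hypothetical */

/* These two predicates are the whole question. */
extern int upstream_ready_for_write(cache_ctx_t *ctx);
extern int interior_ready_for_read(cache_ctx_t *ctx);

extern apr_status_t cache_and_pass_upstream(cache_ctx_t *ctx);
extern apr_status_t read_and_cache_more(cache_ctx_t *ctx);

static apr_status_t cache_step(cache_ctx_t *ctx)
{
    if (upstream_ready_for_write(ctx)) {
        /* Upstream can swallow data right now without blocking:
         * cache it, pass it on, and expect to be called again
         * immediately. */
        return cache_and_pass_upstream(ctx);
    }
    if (interior_ready_for_read(ctx)) {
        /* Upstream is busy, but the interior bucket has more:
         * keep reading and caching so the backend can finish. */
        return read_and_cache_more(ctx);
    }
    /* Neither side is ready: yield back to the event loop. */
    return APR_EAGAIN;
}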

>> That I understand, but it makes no difference as I see it - your loop
>> only reads from the bucket and jams it into the client socket if the
>> client socket is good and ready to accept data.
>>
>> If the client socket isn't good and ready, the bucket doesn't get pulled
>> from, and resources used by the bucket are left in limbo until the
>> client is done. If the bucket wants to do something clever, like cache,
>> or release resources early, it can't - because as soon as it returns the
>> data it has to wait for the client socket to be good and ready all over
>> again. The server runs as slow as the browser, which in computing terms
>> is glacially slow.
> 
> I'm not sure that I understand you, and that you're familiar with the
> serf bucket model.

You are 100% right: I am not completely familiar with the serf bucket
model, which is why I'm asking these questions.

I figure there are no better people to explain how serf works than
those who wrote serf ;)

> The bucket can certainly cache data as it flows through. No problem
> there. Once the bucket has returned all of its data, it can close its
> file handle or socket or whatever resources it may have.
> 
> Buckets are one-time use, so once it has returned all of its data, it
> can throw out any resources.
> 
> And no... the server does NOT run as slow as the browser. There are N
> browsers connected, and the server is processing ALL of them. One
> single response bucket is running as fast as its client, sure, but the
> server certainly is not idle.

That isn't what I meant.

Imagine a big, bloated, expensive application server, the kind that's
typically built by the lowest bidder.

Imagine this server is fronted by an httpd reverse proxy.

Imagine, at the end of the chain, a glacially slow (in computing
terms) browser waiting to consume the response.

A request is processed, and the httpd proxy receives an EOS from the
big bloated application server. Ideally it wants to drop the backend
connection ASAP, there being no point in hanging around, but it can't,
because the cleanup for the backend connection is tied to the request
pool. And the request pool is only destroyed when the last byte of the
response has finally been acknowledged by the glacially slow browser.
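
In 2.x terms the tie-in is just the standard APR cleanup idiom; a
sketch, not the actual mod_proxy code, with backend_conn_t and the
helper hypothetical:

#include <apr_pools.h>

typedef struct backend_conn_t backend_conn_t;      /* hypothetical */

/* Runs only when the request pool is destroyed, i.e. only after
 * the last byte of the response is acknowledged by the browser. */
static apr_status_t backend_cleanup(void *data)
{
    backend_conn_t *conn = data;
    /* close the socket, release the database connection, ... */
    (void)conn;
    return APR_SUCCESS;
}

static void tie_backend_to_request(apr_pool_t *request_pool,
                                   backend_conn_t *conn)
{
    /* The connection's lifetime is now welded to the request
     * pool: even after EOS arrives from the backend, nothing
     * here fires until apr_pool_destroy(request_pool). */
    apr_pool_cleanup_register(request_pool, conn, backend_cleanup,
                              apr_pool_cleanup_null);
}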

So httpd, and the big bloated expensive application server, sit around
waiting, waiting and waiting, with memory allocated and database
connections left open, for the browser to finally say "got it, gimme
some more", before httpd's event loop goes "that was it,
apr_pool_destroy(serf_bucket->pool), next!".

And the reason this happens is that all of it is driven by the core's
event loop, timed against the speed of the glacially slow browser.

Obviously a second browser next door is being serviced at the same
time, as you pointed out, but its backend too waits, waits, waits for
that browser to eventually acknowledge the end of the response.

This is the reason people are sticking things like Varnish caches
between their servers and the browsers: the backend can't terminate
early.

I don't believe httpd v3.0 gives us any value if it suffers from the
same limitation as httpd v2.x.

I can see us solving this problem simply by making the filter stack
non-blocking, and by making content generators event-driven. I don't
see a need to rewrite the server.
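
By non-blocking I mean a filter shaped roughly like this.
ap_pass_brigade() and ap_save_brigade() are real 2.x API;
downstream_would_block() is the hypothetical writability query this
whole thread is about:

#include "httpd.h"
#include "util_filter.h"

typedef struct {
    apr_bucket_brigade *pending;   /* data we couldn't send yet */
} nb_filter_ctx;

/* Hypothetical: "can I write downstream without blocking?" */
extern int downstream_would_block(ap_filter_t *f);

static apr_status_t nonblocking_filter(ap_filter_t *f,
                                       apr_bucket_brigade *bb)
{
    nb_filter_ctx *ctx = f->ctx;   /* set up in an init hook */

    if (downstream_would_block(f)) {
        /* Set the data aside and yield, instead of parking a
         * whole thread behind a glacially slow browser. */
        return ap_save_brigade(f, &ctx->pending, &bb, f->r->pool);
    }

    /* Downstream is ready: flush anything set aside first. */
    if (ctx->pending && !APR_BRIGADE_EMPTY(ctx->pending)) {
        APR_BRIGADE_CONCAT(ctx->pending, bb);
        return ap_pass_brigade(f->next, ctx->pending);
    }
    return ap_pass_brigade(f->next, bb);
}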

>> One event loop handling many requests each == event MPM (speed and
>> resource efficient, but we'd better be bug free).
>> Many event loops handling many requests each == worker MPM (compromise).
>> Many event loops handling one request each == prefork (reliable old
>> workhorse).
> 
> These have no bearing. The current MPM model is based on
> content-generators writing/pushing data into the network.
> 
> A serf-based model reads from content-generators.

So httpd's event loop reads from a buggy, leaky interior bucket.

How does the server protect itself from becoming unstable?

Regards,
Graham
--
