httpd-dev mailing list archives

From Stefan Eissing <stefan.eiss...@greenbytes.de>
Subject buckets and connections (long post)
Date Wed, 21 Oct 2015 14:18:51 GMT
(Sorry for the long post. It was helpful for me to write it down. If it does not
 hold your interest, please just ignore it.)

As I understand it - and that understanding is incomplete - the usual request processing looks like this:

A)
worker:
  conn <--- cfilter <--- rfilter
     |--b-b-b-b-b-b-b-b...

with buckets trickling to the connection through connection and request filters, state being
held on the stack of the assigned worker.

Once the filters are done, we have

B)
  conn 
     |--b-b-b-b-b...

just a connection with a bucket brigade yet to be written. This no longer needs a stack.
The worker can (depending on the MPM) be re-assigned to other tasks. Buckets are streamed
out based on I/O events (for example).

To go from A) to B), the connection needs to set aside buckets, which is only real work for
certain types of buckets. Transient ones, for example, hold data that may reside on the
stack, which is exactly what we need to free in order to reuse the worker.
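
To make that concrete, here is a minimal sketch (my own illustration, not code from
the tree) of what setting a brigade aside amounts to, assuming a pool "deferred_pool"
that outlives the worker's stack frame. Transient buckets have to copy their data off
the stack; heap, immortal and file buckets are (nearly) free to set aside:

#include "apr_buckets.h"

/* Walk the brigade and ask each bucket to survive beyond the current
 * stack frame. For transient buckets this morphs them into heap buckets
 * (a copy); for most other types it is a no-op or a pool re-parenting. */
static apr_status_t setaside_brigade(apr_bucket_brigade *bb,
                                     apr_pool_t *deferred_pool)
{
    apr_bucket *e;

    for (e = APR_BRIGADE_FIRST(bb);
         e != APR_BRIGADE_SENTINEL(bb);
         e = APR_BUCKET_NEXT(e)) {
        apr_status_t rv = apr_bucket_setaside(e, deferred_pool);
        if (rv != APR_SUCCESS && rv != APR_ENOTIMPL) {
            return rv;
        }
    }
    return APR_SUCCESS;
}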

This is beneficial when the work of setting buckets aside has much less impact on the system
than keeping the worker thread allocated. That is especially likely when slow clients are
involved that take ages to read a response.

In HTTP/1.1, a response is usually read fully by the client before it makes the next request.
So, for at least half the round-trip time, the connection will be in state

C)
  conn 
     |-

without anything to read or write. But when the next request comes in, it gets assigned a
worker and is back in state A). Repeat until the connection closes.

Ok, so far?


How well does this mechanism work for mod_http2? On the one hand it's the same, on the other
quite different.

On the real, main connection, the master connection, where the h2 session resides, things
are pretty similar, with some exceptions:
- It is very bursty. Requests continue to come in; there is no pause between responses and
  the next request.
- Pauses, when they happen, will be longer. Clients are expected to keep open connections
  around for longer (if we let them).
- When there is nothing to do, mod_http2 makes a blocking read on the connection input (see
  the sketch below). This currently does not lead to state B) or C); the worker for the
  HTTP/2 connection stays assigned. This needs to improve.
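
As a rough sketch of that last point (an assumed shape, not the actual mod_http2 code;
the function name and read size are made up): with nothing to send, the session thread
ends up in a blocking read on the master connection's input filters, which is exactly
why the worker cannot be handed back to the MPM until the client sends more frames:

#include "httpd.h"
#include "util_filter.h"
#include "apr_buckets.h"

/* Blocking read on the master connection while the h2 session is idle.
 * The calling worker thread sits in here until data arrives, so it
 * stays assigned to this connection the whole time. */
static apr_status_t h2_session_idle_read(conn_rec *c,
                                         apr_bucket_brigade *bb)
{
    return ap_get_brigade(c->input_filters, bb,
                          AP_MODE_READBYTES, APR_BLOCK_READ,
                          16 * 1024 /* assumed read size */);
}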

On the virtual, slave connection, the one for HTTP/2 streams, aka requests, things are very
different:
- The slave connection has a socket purely for the looks of it; there is no real connection
  behind it.
- Eventing for input/output is done via condition variables and a mutex shared with the
  thread working on the main connection.
- The "set-aside" happens when output is transferred from the slave connection to the main
  one. The main connection allows a configurable maximum number of buffered (or set-aside)
  bytes. Whenever the rest of the response fits into this buffer, the slave connection is
  closed and the slave worker is reassigned.
- Even better, when the response is a file bucket, the file handle is transferred, and it is
  not counted against the buffer limit (as it is just a handle). Therefore, static files are
  only looked up by a slave connection; all IO is done by the master thread. A sketch of
  this transfer follows below.
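
To illustrate that transfer, here is a hypothetical sketch (names and the accounting are
mine, not the mod_http2 sources): metadata and file buckets move over as handles/markers,
data buckets get copied into the master's buffer and counted against the configured limit:

#include "apr_buckets.h"

/* Move output from a slave brigade onto the master brigade. */
static apr_status_t transfer_to_master(apr_bucket_brigade *slave_bb,
                                       apr_bucket_brigade *master_bb,
                                       apr_pool_t *master_pool,
                                       apr_off_t *buffered,
                                       apr_off_t limit)
{
    while (!APR_BRIGADE_EMPTY(slave_bb)) {
        apr_bucket *e = APR_BRIGADE_FIRST(slave_bb);
        apr_status_t rv;

        if (APR_BUCKET_IS_METADATA(e)) {
            /* EOS/FLUSH etc.: move as-is, costs nothing */
            APR_BUCKET_REMOVE(e);
            APR_BRIGADE_INSERT_TAIL(master_bb, e);
        }
        else if (APR_BUCKET_IS_FILE(e)) {
            /* transfer the handle, not the bytes; this is not counted
             * against the buffer limit */
            rv = apr_bucket_setaside(e, master_pool);
            if (rv != APR_SUCCESS) {
                return rv;
            }
            APR_BUCKET_REMOVE(e);
            APR_BRIGADE_INSERT_TAIL(master_bb, e);
        }
        else if (e->length != (apr_size_t)-1
                 && *buffered + (apr_off_t)e->length <= limit) {
            /* data bucket: read and copy into the master's buffer */
            const char *data;
            apr_size_t len;

            rv = apr_bucket_read(e, &data, &len, APR_BLOCK_READ);
            if (rv != APR_SUCCESS) {
                return rv;
            }
            rv = apr_brigade_write(master_bb, NULL, NULL, data, len);
            if (rv != APR_SUCCESS) {
                return rv;
            }
            *buffered += (apr_off_t)len;
            apr_bucket_delete(e);
        }
        else {
            /* buffer full or unknown length: leave the rest for later */
            break;
        }
    }
    return APR_SUCCESS;
}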

So state A) is the same for slave connections. B) applies only insofar as the set-aside is
replaced with the transfer of buckets to the master connection - which happens all the time.
So slave connections are just in A) or are gone; slave connections are not kept open.


This is the way it is implemented now. There may be other ways, but this is the way we have.
If we continue along this path, we have the following obstacles to overcome:
1. The master connection could probably play nicer with the MPM so that an idle connection
   uses fewer resources.
2. The transfer of buckets from the slave to the master connection is a COPY, except in the
   case of file buckets (and there is a limit on those as well, so we do not run out of
   handles). All other attempts at avoiding the copy failed. This may be a personal
   limitation of my APRbilities.
3. The amount of buffered bytes should be more flexible per stream, redistributing a maximum
   for the whole session depending on load.
4. mod_http2 needs a process-wide resource allocator for file handles. A master connection
   should borrow n handles at start and increase/decrease that amount based on load, to give
   best performance.
5. Similar optimizations should be possible for other bucket types (mmap? immortal? heap?).
6. Pool buckets are very tricky to optimize, as pool creation/destruction is not thread-safe
   in general, and it depends on how the parent pools and their allocators are set up.
   Early hopes get easily crushed under load.
7. The buckets passed down on the master connection use another buffer - when on https:// -
   to influence the SSL record sizes on write. Another COPY is not nice, but write
   performance is better this way. The SSL optimizations in place do not work for HTTP/2, as
   it has other bucket patterns. We should look at whether we can combine this into something
   without a COPY, but with good-sized SSL writes (see the sketch after this list).
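
For 7., one direction (a thought experiment, not existing code; the 16K target below is
an assumption, the TLS maximum plaintext record size) would be to skip the intermediate
buffer and instead partition the pending brigade at a target record size, so the SSL
output filter sees one well-sized chunk per pass without an extra data copy:

#include "httpd.h"
#include "util_filter.h"
#include "apr_buckets.h"

#define TLS_RECORD_TARGET (16 * 1024)

/* Pass pending output down in record-sized chunks. The partition only
 * splits buckets at the boundary; the data itself is not copied. */
static apr_status_t pass_record_sized(ap_filter_t *f,
                                      apr_bucket_brigade *pending)
{
    apr_status_t rv = APR_SUCCESS;

    while (!APR_BRIGADE_EMPTY(pending) && rv == APR_SUCCESS) {
        apr_bucket *after = NULL;
        apr_bucket_brigade *rest;

        rv = apr_brigade_partition(pending, TLS_RECORD_TARGET, &after);
        if (rv == APR_INCOMPLETE) {
            /* less than one full record left: send it as-is
             * (a real filter might hold a partial record back) */
            return ap_pass_brigade(f->next, pending);
        }
        if (rv != APR_SUCCESS) {
            return rv;
        }
        /* everything from 'after' onwards stays behind for later */
        rest = apr_brigade_split(pending, after);
        rv = ap_pass_brigade(f->next, pending);
        APR_BRIGADE_CONCAT(pending, rest);
        apr_brigade_destroy(rest);
    }
    return rv;
}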


//Stefan



> Am 21.10.2015 um 00:20 schrieb Jim Jagielski <jim@jaguNET.com>:
> 
> Sorry for not being on-list a bit lately... I've been a bit swamped.
> 
> Anyway, I too don't want httpd to go down the 'systemd' route: claim
> something as broken to explain, and justify, ripping it out
> and re-creating anew. Sometimes this needs be done, but not
> very often. And when stuff *is* broken, again, it's best to
> fix it than replace it (usually).

