httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stefan Fritsch ...@sfritsch.de>
Subject Re: timeout queues in event mpm
Date Sat, 12 Nov 2011 19:24:01 GMT
On Fri, 11 Nov 2011, Paul Querna wrote:
> After r1201149, we now lock for lots of things, where in an ideal
> case, we shouldn't need it.

I agree that that's not optimal, but it is no regression. Event always 
used locking for the timeout queues. But what we really should do in 2.4.0 
is remove all the MPM-implementation specific details from conn_state_t. 
The only field that is actually used outside of the MPMs is 'state'. If we 
make the rest non-public and somehow change the logic for the conn_state_t 
to be created by the MPM, we can switch to an improved implementation of 
timeout queues in 2.4.x.

> I'm toying around with ideas on how to eliminate the need for a mutex at all.
>
> My current 'best' idea I think:
>
> 1) Create a new struct, ap_pollset_operation_and_timeout_info_t, which
> contains a what pollset operation to do (Add, Remove, Flags),
> timeout_type, the timeout value, and a pointer to the conn_rec.
>
> 2) Implement a single-reader single writer circular ring buffer of
> these structures. (Using 2x uint16_t head/end offsets stuffed into a
> single uint32_t so we can make it portability atomic using
> apr_atomic_cas32)
>
> 3) Allocate this ring buffer per-worker.
>
> 4) Have the single Event thread de-queue operations from all the worker threads.
>
> This would remove the 4 or 5 separate timeout queues we have
> developed, and their associated mutex, and basically move all of the
> apr_pollset operations to the single main thread.

But then the main thread would need to do a sorted insert into its 
main timeout queue, which is expensive. Or how would you find out when the 
next timeout is due?

> Without modification to the event mpm, it would potentially cause some
> issues as the event thread isn't always waking up that often, but I
> think we can figure out a way to do this without too much pain (either
> a pipe trigger to wake up when in 'slow mode', or just lower the
> default timeout on the pollset form 500ms to like 5ms).

There would have to be some wakeup. Otherwise it is possible that the 
worker would only write to the socket every 5ms in write completion mode, 
which would hurt the maximum transfer rate big time.

> Thoughts?

BTW, IMHO such major overhaul should be done in a separate branch for now.

Other improvements could include having a per thread struct that contains 
all per-thread data and is aligned to e.g. 512 bytes to avoid cache-line 
ping-pong. Currently the worker_sockets array is causing false sharing. 
But as far as I remember, this was not (yet) relevant for performance, 
because other parts (mainly dir/file walk and config merging) are much 
slower anyway.

Cheers,
Stefan

Mime
View raw message