httpd-dev mailing list archives

From Luca Toscano <toscano.l...@gmail.com>
Subject Re: Thundering herd and MPMs (for dummies)
Date Sun, 24 Apr 2016 22:30:40 GMT
2016-04-18 8:47 GMT+02:00 Luca Toscano <toscano.luca@gmail.com>:

> Hi Yann!
>
> 2016-04-16 14:20 GMT+02:00 Yann Ylavic <ylavic.dev@gmail.com>:
>
>> On Sat, Apr 16, 2016 at 2:17 PM, Yann Ylavic <ylavic.dev@gmail.com>
>> wrote:
>> > Hi Luca,
>> >
>> > On Sat, Apr 16, 2016 at 12:07 PM, Luca Toscano <toscano.luca@gmail.com>
>> wrote:
>> >> The sockets are non blocking and without any guard before the
>> >> apr_pollset_poll (between processes I mean) there might be the risk of
>> >> having two or more listener threads trying to accept the same new
>> >> connection, ending up in only one proceeding and the rest getting
>> EAGAIN.
>> >
>> > On modern systems, the thundering herd is not an issue anymore (does
>> > not happen).
>> > There won't be multiple listeners (threads or processes) woken up at
>> > the same time for the same incoming connection when
>> > epoll_wait()/kevent()/... are used (see the corresponding man pages,
>> > EAGAIN is not a possible error, while it is for eg. poll()).
>> > So when accept() is called, we can be sure , so a fortiori for
>> epoll()+accept().
>>
>> [Sorry, unexpected send...]
>> So when accept() is called, we can be sure that a connection is available.
>>
>> >
>> > Since, as you noticed, mpm_event is meant for modern systems, no
>> > ACCEPT_MUTEX is implemented.
>>
>
> Thanks a lot for the answer, it makes more sense now but I still have some
> doubts. I'd have some questions for you :)
>
> My understanding is that each process/thread block initializes an
> event_pollset that initially contains all the listening sockets (from
> Listen) and later also the keep-alive / lingering-close / etc. sockets
> that workers hand back to the listener. Each process/thread block handles
> its own sockets, except the listening ones, which are "shared" (my
> understanding).
>
> Before sending the email I took a look at
> http://man7.org/linux/man-pages/man7/epoll.7.html and Q2/A2 (Q&A) states
> that the same socket "monitored" by different pollsets (or epoll instances,
> depending on the nomenclature) will get reported in each of them once an
> event is ready. EPOLLONESHOT seemed the only flag for epoll_ctl to use to
> avoid multiple threads/processes waking up at the same time, but I didn't
> find any trace of it in apr/httpd.
>
> So I am still super confused about how multiple listener threads
> (belonging to different processes and pollsets) won't be woken up at the
> same time by epoll_wait when a new connection arrives at httpd. The
> explanation that I gave myself was that with non-blocking sockets and
> very few listeners, the overhead of having all of them (try to) accept
> the same connection is not that heavy and could be acceptable
> performance-wise (a simple EAGAIN returned by accept is not a big deal).
>
> I know that my understanding of epoll/httpd is really wrong somewhere,
> but I am still not sure where. If you still have patience (and time),
> would you mind pointing me to a snippet of code that could resolve my
> doubts? I am asking tons of questions because I'd like to write the most
> precise info in the docs without the risk of confusing more readers like
> me (for example in
> http://httpd.apache.org/docs/current/misc/perf-tuning.html#runtime).
>
>
I ran some experiments with the latest httpd 2.4.x code to better
understand the problem. I started by adding basic logging around the
accept() part of the event MPM's listener thread:

Trivial patch, probably horribly written: http://apaste.info/rL5

I used a basic httpd.conf that starts 6 processes with the event MPM and
two listening ports (80 and 8080). I then made requests to localhost with
curl and saw only a single accept attempt every time, for example:

[mpm_event:info] [pid 5975:tid 139917770266368] PT_ACCEPT after epoll_wait
[mpm_event:info] [pid 5975:tid 139917770266368] Accepting..
[mpm_event:info] [pid 5975:tid 139917770266368] Accept went fine!

I expected to also see failed accept attempts, which would have validated
my theory, but no luck. So I ran strace on the httpd processes (Linux,
Debian) to get more info:

strace -f $(pgrep httpd | sed -e 's/^/-p/g')

I saw regular calls to epoll_wait from all the listener threads, as
expected, and when I made an HTTP request I got something interesting:

http://apaste.info/oox

The only explanation I can find for this behavior is strace's "violent"
way of tracing a program: it stops the program each time a system call is
invoked in order to log the event, which slows everything down a lot. On
busy servers the different listeners are probably not in sync while
waiting for apr_pollset_poll events, so the multiple wake-up issue is not
really a concern.
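
For readers who want to reproduce the multiple wake-up scenario outside
httpd, here is a minimal sketch. It is not httpd or APR code: it assumes
Python's standard selectors module as a stand-in for two independent
pollsets, and the function name shared_listener_demo is made up for
illustration. One non-blocking listening socket is registered in both
pollsets, a single client connects, and then two accept attempts are made.

```python
import selectors
import socket


def shared_listener_demo():
    """Register one non-blocking listening socket in two independent pollsets."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(8)
    listener.setblocking(False)

    # Two independent selectors stand in for the pollsets of two listeners
    # (hypothetical stand-ins, not the APR pollset API).
    pollsets = [selectors.DefaultSelector(), selectors.DefaultSelector()]
    for pollset in pollsets:
        pollset.register(listener, selectors.EVENT_READ)

    # A single incoming connection.
    client = socket.create_connection(listener.getsockname())

    # Poll both pollsets before anyone calls accept().
    woken = [bool(pollset.select(timeout=1.0)) for pollset in pollsets]

    # Each "listener" now tries to accept the same connection.
    outcomes = []
    for _ in pollsets:
        try:
            conn, _addr = listener.accept()
            conn.close()
            outcomes.append("accepted")
        except BlockingIOError:  # EAGAIN/EWOULDBLOCK on a non-blocking socket
            outcomes.append("EAGAIN")

    client.close()
    for pollset in pollsets:
        pollset.close()
    listener.close()
    return woken, outcomes


if __name__ == "__main__":
    print(shared_listener_demo())
```

With a level-triggered backend such as epoll, both pollsets should report
the listening socket as readable, while only the first accept() gets the
connection and the second fails with EAGAIN (BlockingIOError in Python).
Everything runs in one process here, so it says nothing about how often
real listeners in separate processes actually race.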

My point is that this is not bad behavior but an awesome trade-off that
avoids any accept mutex/serialization. It would be very useful
information for users, especially when comparing httpd with other
solutions (httpd is awesome). So if you like the idea, and if my
understanding is correct, I'll update the docs. Otherwise you are free to
ban me from this mailing list if you wish :)

Thanks again for the patience!

Luca
