perl-dev mailing list archives

From Fred Moyer <f...@redhotpenguin.com>
Subject Re: [PATCH 2.0.5 futex] Fix to Children stuck on futex problem
Date Tue, 06 Mar 2012 17:40:36 GMT
Hi Salusa,

Would you or Max be able to construct a unit test that demonstrates
this failure condition, and then success once the patch is applied?
There should be some example tests in the t/ directory which you can
draw on for inspiration.
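
As a possible starting point, the broken trace quoted below can be modeled deterministically without Apache or real threads. This is only a rough sketch, not a test for t/: `pool_model`, `model_putback`, and `stranded_waiters` are made-up names, and the signal condition is an assumption that mirrors the `in_use == max - 1` check visible in the patch context.

```c
#include <assert.h>

/* Hypothetical model of the pool state; NOT the real modperl_tipool_t. */
typedef struct {
    int max;            /* interpreters in the pool */
    int in_use;         /* interpreters currently checked out */
    int waiting;        /* threads blocked on the condition variable */
    int woken;          /* threads signaled but not yet running again */
    int always_signal;  /* 0 = original policy, 1 = patched policy */
} pool_model;

/* A signal wakes at most one blocked thread. */
static void model_signal(pool_model *p)
{
    if (p->waiting > 0) {
        p->waiting--;
        p->woken++;
    }
}

static void model_acquire(pool_model *p)
{
    assert(p->in_use < p->max);  /* caller must only acquire a free one */
    p->in_use++;
}

/* Original policy: signal only if no interpreter was free before
 * this putback. Patched policy: signal on every putback. */
static void model_putback(pool_model *p)
{
    p->in_use--;
    if (p->always_signal || p->in_use == p->max - 1) {
        model_signal(p);
    }
}

/* Replays the "broken behavior" schedule with 2 interpreters and
 * 4 threads, then returns how many threads are still blocked. */
int stranded_waiters(int always_signal)
{
    pool_model p = { 2, 0, 0, 0, always_signal };

    model_acquire(&p);   /* 1: A */
    model_acquire(&p);   /* 2: A */
    p.waiting = 2;       /* 3: W, 4: W */
    model_putback(&p);   /* 1: P */
    model_putback(&p);   /* 2: P */

    /* Every woken thread acquires an interpreter and puts it back. */
    while (p.woken > 0) {
        p.woken--;
        model_acquire(&p);
        model_putback(&p);
    }
    return p.waiting;
}
```

In this model `stranded_waiters(0)` leaves one thread blocked and `stranded_waiters(1)` leaves none. A real regression test would still need to drive the threaded MPM, since the model says nothing about actual scheduling.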

On Sun, Mar 4, 2012 at 10:29 AM, SalusaSecondus <salusa@nationstates.net> wrote:
> (Patch and system details at bottom)
>
> Hi all. I've root-caused and written a patch for the children stuck on
> futex problem described by both Sean Thorne in 2009 and Max Barry (who I
> work with) in 2011.
>
> The core of the problem is that modperl_tipool_putback_base only
> broadcasts that there are more interpreters available when there were no
> available interpreters prior to this putback. While this makes sense, it
> can create a problem.
>
> Notation:
> A: Acquire an interpreter
> P: Putback an interpreter
> B: Broadcast a free interpreter (really a signal)
> W: Wait on condition tipool->available (for free interpreter)
> (x,y): x is the number of interpreters in use at this point. y is the
> number free.
> The number at the beginning of a line is the thread number
> Each line occurs within a single critical section (on mutex tipool->tiplock)
>
> Expected behavior:
> 4 threads, 2 free interpreters
> 1: A (1,1)
> 2: A (2,0)
> 3: W
> 4: W
> 1: P (1,1) B
> 3: A (2,0)
> 2: P (1,1) B
> 4: A (2,0)
> 3: P (1,1) B
> 4: P (0,2) <-- No broadcast because there was an available interpreter
> prior to this putback.
>
> Broken behavior:
> 4 threads, 2 free interpreters
> 1: A (1,1)
> 2: A (2,0)
> 3: W
> 4: W
> 1: P (1,1) B
> 2: P (0,2) <-- No broadcast because there was an available interpreter
> prior to this putback.
> 3: A (1,1)
> 3: P (0,2) <-- No broadcast because there was an available interpreter
> prior to this putback.
> (Broken)
>
> Thread 4 will never be signaled to pick up an interpreter, so it stays
> blocked on the futex. Sooner or later Apache tells this worker process
> to die (due to MaxRequestsPerChild); the parent thread then waits to
> join the child threads, but one or more of them will never wake up.
>
> My proposed fix is to always broadcast the availability of an
> interpreter, regardless of whether any were already free. With this
> change, every test I have found to throw at it passes, and the deadlock
> no longer occurs when reproducing the problem according to Max's
> instructions (http://pastebin.com/YDbmq84w).
>
> My System Details:
> uname -a: Linux modperl 2.6.38-8-server #42-Ubuntu SMP Mon Apr 11
> 03:49:04 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
> Apache: Custom build of 2.2.20 with ubuntu patches
> (http://packages.ubuntu.com/source/oneiric/apache2)
> modperl: Custom build of 2.0.5 with ubuntu patches
> (http://packages.ubuntu.com/source/oneiric/libapache2-mod-perl2)
> Build process: Standard ubuntu build process with following flags set:
> DEB_BUILD_OPTIONS="nostrip parallel=2 debug"
> CFLAGS="-g -O2 -DMP_TRACE=1 -DPERL_DESTRUCT_LEVEL=2 -DMP_DEBUG=1
> -UMP_USE_GTOP -I/usr/include/libgtop-2.0/ -I/usr/include/glib-2.0/
> -I/usr/lib/x86_64-linux-gnu/glib-2.0/include/"
>
> Patch:
> --- src/modules/perl/modperl_tipool.c.old       2012-03-03 19:43:57.112152297 -0800
> +++ src/modules/perl/modperl_tipool.c   2012-03-03 04:28:31.000000000 -0800
> @@ -328,9 +328,9 @@
>     MP_TRACE_i(MP_FUNC, "0x%lx now available (%d in use, %d running)",
>                (unsigned long)listp->data, tipool->in_use, tipool->size);
>
> +    modperl_tipool_broadcast(tipool);
>     if (tipool->in_use == (tipool->cfg->max - 1)) {
>         /* hurry up, another thread may be blocking */
> -        modperl_tipool_broadcast(tipool);
>         modperl_tipool_unlock(tipool);
>         return;
>     }
>
>
> Please let me know how best to get this checked in and out. As you might
> imagine, this futex problem has been causing us quite a few headaches :-)
>
> Greg Rubin
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@perl.apache.org
> For additional commands, e-mail: dev-help@perl.apache.org
>


