apr-dev mailing list archives

From Jeff Trawick <traw...@attglobal.net>
Subject Re: apr_proc_mutex is broken
Date Tue, 19 Nov 2002 01:20:11 GMT
Philip Martin <philip@codematters.co.uk> writes:

> I'm using the following patch to the testsuite.  It increases the
> amount of contention for the shared data, and causes the tests to fail
> reliably on my SMP workstation, using ReiserFS, ext2 and tmpfs, as
> well as on my non-SMP laptop using ext3.  Aaron is having problems
> reproducing the bug, perhaps someone else could try.
> 
> Index: test/testglobalmutex.c
> ===================================================================
> RCS file: /home/cvspublic/apr/test/testglobalmutex.c,v
> retrieving revision 1.3
> diff -u -r1.3 testglobalmutex.c
> --- test/testglobalmutex.c	9 Apr 2002 06:45:06 -0000	1.3
> +++ test/testglobalmutex.c	19 Nov 2002 00:48:46 -0000
> @@ -65,7 +65,7 @@
>  #include "test_apr.h"
>  
>  
> -#define MAX_ITER 4000
> +#define MAX_ITER 40
>  #define MAX_COUNTER (MAX_ITER * 4)
>  
>  apr_global_mutex_t *global_lock;
> @@ -82,13 +82,16 @@
>  
>      if (apr_proc_fork(*proc, p) == APR_INCHILD) {
>          while (1) {
> +            int x_t;
>              apr_global_mutex_lock(global_lock); 
>              if (i == MAX_ITER) {
>                  apr_global_mutex_unlock(global_lock); 
>                  exit(1);
>              }
>              i++;
> -            (*x)++;
> +            x_t = *x;
> +            apr_sleep(1);
> +            *x = x_t + 1;
>              apr_global_mutex_unlock(global_lock); 
>          }
>          exit(1);
> Index: test/testprocmutex.c
> ===================================================================
> RCS file: /home/cvspublic/apr/test/testprocmutex.c,v
> retrieving revision 1.10
> diff -u -r1.10 testprocmutex.c
> --- test/testprocmutex.c	9 Apr 2002 06:45:06 -0000	1.10
> +++ test/testprocmutex.c	19 Nov 2002 00:48:46 -0000
> @@ -65,7 +65,7 @@
>  #include "test_apr.h"
>  
>  
> -#define MAX_ITER 4000
> +#define MAX_ITER 40
>  #define MAX_COUNTER (MAX_ITER * 4)
>  
>  apr_proc_mutex_t *proc_lock;
> @@ -82,13 +82,16 @@
>  
>      if (apr_proc_fork(*proc, p) == APR_INCHILD) {
>          while (1) {
> +            int x_t;
>              apr_proc_mutex_lock(proc_lock); 
>              if (i == MAX_ITER) {
>                  apr_proc_mutex_unlock(proc_lock); 
>                  exit(1);
>              }
>              i++;
> -            (*x)++;
> +            x_t = *x;
> +            apr_sleep(1);
> +            *x = x_t + 1;
>              apr_proc_mutex_unlock(proc_lock); 
>          }
>          exit(1);
> 
> 
> Disabling the pool cleanup with this patch causes the tests to pass.
> This is obviously not a solution to the problem, it's simply a
> demonstration of the cause of the problem.

I think the problem is in the test programs (e.g., testprocmutex).  As
soon as one child hits the specified number of iterations, that child
exits, its pool cleanups run, and something like this happens:

  #0  proc_mutex_sysv_cleanup (mutex_=0x804e608) at proc_mutex.c:205
  #1  0x4002dfa1 in run_cleanups (c=0x804e660) at apr_pools.c:1973
  #2  0x4002d59e in apr_pool_destroy (pool=0x804e4f8) at apr_pools.c:755
  #3  0x4002d58a in apr_pool_destroy (pool=0x804a4e8) at apr_pools.c:752
  #4  0x4002d24a in apr_pool_terminate () at apr_pools.c:585
  #5  0x4002a45d in apr_terminate () at start.c:117

And of course as soon as semctl(IPC_RMID) is done, the semaphore is
gone and the lock is broken for the children that are still running.
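
To make the IPC_RMID point concrete, here is a minimal sketch that uses
plain SysV semaphore calls rather than APR.  The names semun_arg,
sem_create, sem_adjust and demo_lock_after_rmid are made up for this
example and are not part of APR or the test suite.  It assumes Linux,
where the caller has to define the semctl() argument union itself.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* On Linux the caller defines this union; see semctl(2). */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a private set holding one semaphore, initialised to 1 (unlocked). */
static int sem_create(void)
{
    union semun_arg arg;
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    if (semid < 0)
        return -1;
    arg.val = 1;
    if (semctl(semid, 0, SETVAL, arg) < 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    return semid;
}

/* delta = -1 takes the lock, delta = +1 releases it. */
static int sem_adjust(int semid, short delta)
{
    struct sembuf op;

    op.sem_num = 0;
    op.sem_op = delta;
    op.sem_flg = SEM_UNDO;
    return semop(semid, &op, 1);
}

/* Returns 0 if locking works before removal and fails after it,
 * -1 if anything else happens. */
int demo_lock_after_rmid(void)
{
    int semid = sem_create();

    if (semid < 0)
        return -1;

    /* While the semaphore exists, lock and unlock succeed. */
    if (sem_adjust(semid, -1) != 0 || sem_adjust(semid, 1) != 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    /* Remove the semaphore, as a cleanup in an exiting child would. */
    if (semctl(semid, 0, IPC_RMID) != 0)
        return -1;

    /* Any later lock attempt on the same id is now expected to fail. */
    if (sem_adjust(semid, -1) == 0)
        return -1;
    fprintf(stderr, "lock after IPC_RMID failed: %s\n", strerror(errno));
    return 0;
}
```

Calling demo_lock_after_rmid() from a small main() should print the
errno text from the failed semop() on stderr.  A process that is
already blocked in semop() when the removal happens gets an error too,
so nobody is left holding a working lock.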

Here is IMHO what you should try:

1)  restore proper cleanup code
2)  tweak make_child() in the test program to trap errors from
    apr_proc_mutex_lock() and apr_proc_mutex_unlock() 

I bet you are hitting errors with the mutex, but they're not being
reported right now.  And of course if the mutex doesn't work right the
counter in shared memory is no longer interesting either.

-- 
Jeff Trawick | trawick@attglobal.net
Born in Roswell... married an alien...
