apr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Philip Martin <phi...@codematters.co.uk>
Subject Re: [PATCH] Re: apr_proc_mutex is broken
Date Tue, 19 Nov 2002 14:59:47 GMT
Jeff Trawick <trawick@attglobal.net> writes:

> I contemplated relatively-complex exit sequences so that children
> didn't exit until the test was over, but life is short, and these test
> programs should be short too.
> 
> Besides adding crude error checking (crude better than none) for some
> critical APR calls, this patch ensures that apr_terminate() is not
> called in the child processes.

Huh?  I don't understand this.  The child process still destroys the
semaphore.  When a child exits there is nothing that guarantees that
the other children have finished.  It depends on the OS scheduler,
unless the children implement some explicit synchronisation.

Unsurprisingly, I get

$ ./testprocmutex  
APR Proc Mutex Test
==============

Exclusive lock test
    Initializing the lock                                   OK
    Starting all of the processes                           lt-testprocmutex: testprocmutex.c:93:
make_child: Assertion `apr_proc_mutex_unlock(proc_lock) == 0' failed.
lt-testprocmutex: testprocmutex.c:93: make_child: Assertion `apr_proc_mutex_unlock(proc_lock)
== 0' failed.
OK
lt-testprocmutex: testprocmutex.c:93: make_child: Assertion `apr_proc_mutex_unlock(proc_lock)
== 0' failed.
    Waiting for processes to exit                           OK
Locks don't appear to work!  x = 4003 instead of 16000


> In reality it would be nice to set up a proc mutex such that it is
> only destroyed up in the parent.  I suspect that this is reasonable
> default behavior, in which case no API change would be necessary.

It's more than "nice"!  Without some change to the current behaviour
it is not possible for processes to exit unless they implement some
sort of synchronisation.

> This could be implemented by storing a pid_t in the internal mutex
> representation and comparing that with getpid() in the cleanup before
> deciding whether to tell the kernel to destroy the mutex.  The exact
> details would vary between mutex types.  For file-based mutexes, there
> is no destroy syscall, and the existing logic to close the file still
> needs to be performed regardless of parent or child.

Of course the test should check return values, but that's not enough,
even if the pool cleanup is fixed.  The test still doesn't really test
that the proc_mutex implementation provides any mutex behaviour.
Given that the old test often passed even after the mutex had been
destroyed, it is likely that the test would pass with a proc_mutex
implementation consisting of stubs that return APR_SUCCESS, or on a
platform where the underlying semaphore is broken.  That's not much of
a test (or my expectations of the test suite are too high).

> Index: testprocmutex.c
> ===================================================================
> RCS file: /home/cvs/apr/test/testprocmutex.c,v
> retrieving revision 1.10
> diff -u -r1.10 testprocmutex.c
> --- testprocmutex.c	9 Apr 2002 06:45:06 -0000	1.10
> +++ testprocmutex.c	19 Nov 2002 13:28:10 -0000
> @@ -60,6 +60,7 @@
>  #include "apr_general.h"
>  #include "apr_getopt.h"
>  #include "errno.h"
> +#include <assert.h>
>  #include <stdio.h>
>  #include <stdlib.h>
>  #include "test_apr.h"
> @@ -70,7 +71,7 @@
>  
>  apr_proc_mutex_t *proc_lock;
>  apr_pool_t *pool;
> -int *x;
> +int *x, *y;
>  
>  static int make_child(apr_proc_t **proc, apr_pool_t *p)
>  {
> @@ -82,14 +83,14 @@
>  
>      if (apr_proc_fork(*proc, p) == APR_INCHILD) {
>          while (1) {
> -            apr_proc_mutex_lock(proc_lock); 
> +            assert(apr_proc_mutex_lock(proc_lock) == APR_SUCCESS);
>              if (i == MAX_ITER) {
> -                apr_proc_mutex_unlock(proc_lock); 
> +                assert(apr_proc_mutex_unlock(proc_lock) == APR_SUCCESS);
>                  exit(1);
>              }
>              i++;
>              (*x)++;

This line is the problem, (*x)++ is too fast to be a good test of the
mutex, it's simply too few instructions for there to be a much chance
of the OS scheduling another process (unless one happens to be using
an SMP machine).  The sequence "read, wait, increment, write" is a
much better test as the wait requires the other processes to block.
The test doesn't have to run the loop thousands of times, half a dozen
would do, but it does have to hold the mutex for a "significant"
length of time.

> -            apr_proc_mutex_unlock(proc_lock); 
> -            apr_proc_mutex_unlock(proc_lock); 
> +            assert(apr_proc_mutex_unlock(proc_lock) == APR_SUCCESS);
>          }
>          exit(1);
>      }

-- 
Philip Martin

Mime
View raw message