apr-dev mailing list archives

From Philip Martin <phi...@codematters.co.uk>
Subject Re: apr_proc_mutex is broken
Date Sun, 17 Nov 2002 12:25:46 GMT
Philip Martin <philip@codematters.co.uk> writes:

> The first problem is the line
>     if (apr_os_thread_equal(mutex->owner, apr_os_thread_current())) {
> where there is access to the shared data mutex->owner without any sort
> of synchronization.  Now mutex->owner may not be an atomic type, in
> which case a totally bogus value could be obtained, and if it is an
> atomic type the unsynchronized access is still pointless.
> However the real problem is that this is supposed to be a *process*
> lock, and yet it is comparing *thread* IDs.  Thread IDs are distinct
> within a process, but there is no guarantee that they are distinct
> across multiple processes.  Comparing thread IDs from two separate
> processes is undefined behaviour when using POSIX threads.  There is
> at least one common platform (GNU glibc threads) where thread IDs are
> duplicated.  When I run APR's testprocmutex test on a 2-way SMP Linux
> box it regularly fails (that it doesn't always fail is, I suspect,
> because it is not a particularly good test and so the processes often
> complete without any mutex contention).

I've looked at the proc mutex code again, and things are less clear.

I see now that apr_proc_mutex_create and apr_proc_mutex_unlock both
set the mutex->owner field to zero to indicate an invalid mutex.  This
is not valid for a POSIX thread system because a) zero may be a valid
thread ID, and b) passing an "invented" thread ID to pthread_equal is
undefined behaviour.
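
One way to avoid the "invented" thread ID is to keep a separate flag that says whether the stored ID is valid, so that pthread_equal only ever sees IDs obtained from pthread_self. The following is a minimal sketch, not APR's code: owner_t and the owner_* helpers are hypothetical names. It does nothing about the unsynchronized access described in the quoted message, so it assumes the caller already holds whatever lock protects the record.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical owner record (not an APR type). The flag records whether
   tid holds a real thread ID, so no sentinel value such as zero is ever
   stored in, or compared against, a pthread_t. */
typedef struct {
    bool has_owner;
    pthread_t tid;
} owner_t;

/* Mark the record as having no owner; tid is left untouched. */
static void owner_clear(owner_t *o)
{
    o->has_owner = false;
}

/* Record the calling thread as the owner. */
static void owner_set_self(owner_t *o)
{
    o->tid = pthread_self();
    o->has_owner = true;
}

/* True only when an owner is recorded and it is the calling thread.
   pthread_equal is reached only when tid came from pthread_self. */
static bool owner_is_self(const owner_t *o)
{
    return o->has_owner && pthread_equal(o->tid, pthread_self());
}
```

This only addresses the within-process comparison. Storing such a record in memory shared between processes would still compare thread IDs across processes, which is the problem raised in the quoted text.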

However it may well work on a Linux glibc 2.2.5 system where I believe
pthread_t is an unsigned long and zero is not used as a thread ID.  It
also means that my complaint about comparing thread IDs from different
processes does not apply (I was assuming mutex->owner was initialized
to the thread ID of the thread that created the proc mutex, as that is
the only available valid thread ID).

Despite the problems with the proc mutex code I cannot identify one
which would cause the test failure I am seeing.  I'm running the test
on an SMP (dual P3) Linux machine and it fails about one run in three.
Here is the output of a typical failure:

  $ ./testprocmutex 
  APR Proc Mutex Test

  Exclusive lock test
      Initializing the lock                                   OK
      Starting all of the processes                           OK
      Waiting for processes to exit                           OK
  Locks don't appear to work!  x = 15998 instead of 16000

The test is multi-process, but the processes are single threaded.  I
guess the problem lies somewhere in the semaphores, but I don't have
any experience of using those.
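
For comparison, the kind of check the test performs can be sketched on its own, outside APR. This is an illustration under assumptions, not the testprocmutex source: it assumes Linux, uses a System V semaphore as the lock (the message does not say which mechanism APR selected on this machine), and the names run_counter_test, sem_change and union sem_arg are made up for the example.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <sys/ipc.h>
#include <sys/mman.h>
#include <sys/sem.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Argument union for semctl; callers define it themselves on Linux.
   The name sem_arg is ours. */
union sem_arg { int val; struct semid_ds *buf; unsigned short *array; };

/* Apply delta (-1 = lock, +1 = unlock) to semaphore 0, retrying on EINTR. */
static int sem_change(int semid, short delta)
{
    struct sembuf op;
    int rc;
    op.sem_num = 0;
    op.sem_op = delta;
    op.sem_flg = SEM_UNDO;
    do {
        rc = semop(semid, &op, 1);
    } while (rc < 0 && errno == EINTR);
    return rc;
}

/* Fork nprocs single-threaded children; each adds 1 to a shared int
   iters times while holding the semaphore. Returns the final count,
   or -1 if the shared memory or semaphore could not be set up. */
int run_counter_test(int nprocs, int iters)
{
    union sem_arg arg;
    int i, j, result, semid;
    int *x = mmap(NULL, sizeof *x, PROT_READ | PROT_WRITE,
                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (x == MAP_FAILED)
        return -1;
    *x = 0;

    semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    arg.val = 1;                      /* 1 = unlocked */
    if (semid < 0 || semctl(semid, 0, SETVAL, arg) < 0) {
        if (semid >= 0)
            semctl(semid, 0, IPC_RMID);
        munmap(x, sizeof *x);
        return -1;
    }

    for (i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                    /* fewer children; count will be short */
        if (pid == 0) {
            for (j = 0; j < iters; j++) {
                if (sem_change(semid, -1) < 0)
                    _exit(1);
                (*x)++;               /* the critical section */
                if (sem_change(semid, 1) < 0)
                    _exit(1);
            }
            _exit(0);
        }
    }

    /* Reap every child before reading the counter. */
    while (wait(NULL) > 0 || errno == EINTR)
        ;

    result = *x;
    semctl(semid, 0, IPC_RMID);
    munmap(x, sizeof *x);
    return result;
}
```

Called as run_counter_test(4, 4000), a split chosen here only to match the total of 16000, this should return 16000 when the lock works. A smaller value is the same kind of lost update as the "x = 15998" in the output above.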

Philip Martin
