apr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Philip Martin <phi...@codematters.co.uk>
Subject apr_proc_mutex is broken
Date Sat, 16 Nov 2002 23:36:20 GMT
Hello

The following issue was raised in the Subversion issue tracker

http://subversion.tigris.org/issues/show_bug.cgi?id=996

"While tracking down Bug 995, I've found out that apr's testglobalmutex
 and testprocmutex are failing for me when running in SMP mode on a
 Dual P3.  The same tests do not fail when running in Single Processor
 mode or running on my Dual Athlon in SMP mode.  I am not certain if
 this related to Bug 995, but it does appear that httpd is going into a
 deadlock and the subversion cmdline client eventually times out.  I've
 also noticed that apache leaks semaphores when this happens (from
 ipcs)."

I'm not really sure why the submitter chose to report it to
Subversion, and not (as far as I can tell) to APR, but I took a look
at the proc_mutex code and there does appear to be a bug.

The documentation for apr_proc_mutex_create states that a proc mutex
is intended to synchronize processes.  Consider

APR_DECLARE(apr_status_t) apr_proc_mutex_lock(apr_proc_mutex_t *mutex)
{
    apr_status_t rv;

#if APR_HAS_THREADS
    if (apr_os_thread_equal(mutex->owner, apr_os_thread_current())) {
        mutex->owner_ref++;
        return APR_SUCCESS;
    }
#endif

    if ((rv = mutex->meth->acquire(mutex)) != APR_SUCCESS) {
        return rv;
    }

#if APR_HAS_THREADS
    mutex->owner = apr_os_thread_current();
    mutex->owner_ref = 1;
#endif

    return APR_SUCCESS;
}

The first problem is the line
    if (apr_os_thread_equal(mutex->owner, apr_os_thread_current())) {
where there is access to the shared data mutex->owner without any sort
of synchronization.  Now mutex->owner may not be an atomic type, in
which case a totally bogus value could be obtained, and if it is an
atomic type the unsynchronized access is still pointless.

However the real problem is that this is supposed to be a *process*
lock, and yet it is comparing *thread* IDs.  Thread IDs are distinct
within a process, but there is no guarantee that they are distinct
across mutiple processes.  Comparing thread IDs from two separate
processes is undefined behaviour when using POSIX threads.  There is
at least one common platform (GNU glibc threads) where thread IDs are
duplicated.  When I run APR's testprocmutex test on a 2-way SMP Linux
box it regularly fails (that it doesn't always fail is, I suspect,
because it is not a particularly good test and so the processes often
complete without any mutex contention).

-- 
Philip Martin

Mime
View raw message