apr-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mark Gooderum <m...@verniernetworks.com>
Subject PThread cond_wait bug w/nested mutex
Date Thu, 21 Aug 2003 21:16:32 GMT
I've discovered what I think it a bug in the pthreads based cond_wait() 
code when used with nested mutexes.

The "proper" (per my experience) way to use a condition variable in the 
pthreads world is a squence like:

Waiter            Notifier
----------        -------------
mutex_lock
cond_wait
(pthreads atomically surrenders lock and waits)
                   mutex_lock
                   cond_signal
                   mutex_unlock
(pthread atomically requires the mutex and wakes up)
mutex_unlock

But if the waiter code is for instance in a library and is called with 
the nested mutex already locked the lock/cond_wait/unlock cycle always 
clears the mutex.  So a cycle like:

Waiter            Notifier
----------        -------------
mutex_lock
mutex_lock
cond_wait
(pthreads atomically surrenders lock and waits)
                   mutex_lock
                   cond_signal
                   mutex_unlock
(pthread atomically requires the mutex and wakes up)
mutex_unlock
mutex_unlock

is broken - Details follow - what happens is the mutex_lock/unlock cycle 
in the notifier trashes the APR portions of the mutex.

    APR_DECLARE(apr_status_t) apr_thread_cond_wait(apr_thread_cond_t *cond,
                                                   apr_thread_mutex_t
    *mutex)
    {
        apr_status_t rv;

        rv = pthread_cond_wait(cond->cond, &mutex->mutex);
    #ifdef PTHREAD_SETS_ERRNO
        if (rv) {
            rv = errno;
        }
    #endif
    }

The owner and owner_ref fields of the APR mutex structure are left 
alone.  The caller does a mutex_lock which works and destroys the 
owner/owner_ref fields:

    APR_DECLARE(apr_status_t) apr_thread_mutex_lock(apr_thread_mutex_t
    *mutex)
    {
        apr_status_t rv;

        if (mutex->nested) {
    ---->This doesn't hit because owner is the other waiting thread:
            if (apr_os_thread_equal(mutex->owner,
    apr_os_thread_current())) {  
                mutex->owner_ref++;
                return APR_SUCCESS;
            }
    --->So we succeed in "stealing" the mutex
            rv = pthread_mutex_lock(&mutex->mutex);
            if (rv) {
    #ifdef PTHREAD_SETS_ERRNO
                rv = errno;
    #endif
                return rv;
            }

            mutex->owner = apr_os_thread_current();
            mutex->owner_ref = 1;
            return rv;
        }
        else {
            rv = pthread_mutex_lock(&mutex->mutex);
    #ifdef PTHREAD_SETS_ERRNO
            if (rv) {
                rv = errno;
            }
    #endif
            return rv;
        }
    }

When the notifier than calls mutex_unlock() the pthread mutex is 
released and the mutex owner/owner_ref fields zeroed out.

So now the waiter thread gets the pthread mutex inside 
pthread_cond_wait() and wakes up.  The call returns to the caller.  The 
caller then trys to release the mutex - this call succeeds because we 
always release the mutex.  But the mutex is now fully released  - so if 
the caller is counting on holding the mutex with nesting he's hosed.

    APR_DECLARE(apr_status_t) apr_thread_mutex_unlock(apr_thread_mutex_t
    *mutex)
    {
        apr_status_t status;

        if (mutex->nested) {
            ---->This fails so it just releases the pthread_mutex
    (incorrectly if my original nest count was > 1)
            if (apr_os_thread_equal(mutex->owner,
    apr_os_thread_current())) {
                mutex->owner_ref--;
                if (mutex->owner_ref > 0)
                    return APR_SUCCESS;
            }
            status = pthread_mutex_unlock(&mutex->mutex);
            if (status) {
    #ifdef PTHREAD_SETS_ERRNO
                status = errno;
    #endif
                return status;
            }

            memset(&mutex->owner, 0, sizeof mutex->owner);
            mutex->owner_ref = 0;
            return status;
        }
        else {
            status = pthread_mutex_unlock(&mutex->mutex);
    #ifdef PTHREAD_SETS_ERRNO
            if (status) {
                status = errno;
            }
    #endif
            return status;
        }
    }

The fix seems to be to save and restore the owner/owner_ref fields 
around the pthread_cond_wait() call.  The same thing applies to the 
timed version of the wait but I used the shorter version here for 
clarity.  P.S.. - I'm not subscribed to the dev list so please CC me on 
replies, thanks.
--
Mark Gooderum
Vernier Networks, Inc.
mark@verniernetworks.com






 



Mime
View raw message