apr-dev mailing list archives

From Colin Hirsch <share-apa...@think42.com>
Subject Re: Atomic Operations Assembly
Date Wed, 13 Sep 2006 16:28:10 GMT
On Tue, Sep 12, 2006 at 09:48:00AM -0400, Garrett Rooney wrote:
> On 9/8/06, Colin <share-apache@think42.com> wrote:
> >
> >for one of my projects I was recently looking for implementations of some
> >atomic primitives for Sparc and PowerPC, and also stumbled over the code
> >in apr_atomic.c.
> >
> >A few weeks later I am returning here with a bug report and an
> >implementation for SPARC.
> >
> >1) PowerPC Bug
> >
> >The Bug is actually in the Chip, not in the APR code; as stated in erratum
> >number 12 in
> >
> >http://www-306.ibm.com/chips/techlib/techlib.nsf/techdocs/79B6E24422AA101287256E93006C957E/$file/PowerPC_970FX_errata_DD3.X_V1.7.pdf
> >
> >one must not spin in a three-instruction lwarx/ldarx loop because there is
> >obviously at least one PowerPC chip that can enter a livelock...
> >
> >(Unfortunately I don't have a solution for this one; I am on Mac OS X
> >where the OS supplies the spinlock primitive that I actually need; the Mac
> >OS X version is documented as using exponential back-off.)

I should add that the bug is very real: I have already successfully
deadlocked my PowerMac Quad G5 when using a similar loop to implement
a spinlock...
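
For reference, below is a minimal sketch of the back-off idea at the C
level, written against APR's own apr_atomic_cas32/read32/set32
interface. The spin_lock/spin_unlock names and the 0/1 lock convention
are made up for illustration; this is not the Mac OS X primitive (which
sits below this level, in assembly), just the general pattern of not
hammering the underlying lwarx/stwcx. loop back-to-back:

#include "apr_atomic.h"

/* Sketch of a spinlock with exponential back-off on top of the APR
 * 32-bit atomics.  The lock word holds 0 (free) or 1 (held). */
static void spin_lock(volatile apr_uint32_t *lock)
{
    unsigned int delay = 1;

    /* apr_atomic_cas32() returns the old value; 0 means we got the lock. */
    while (apr_atomic_cas32(lock, 1, 0) != 0) {
        volatile unsigned int i;

        /* Back off before retrying, doubling the delay up to a cap, so
         * that contending CPUs do not retry their reservation loops in
         * lock-step. */
        for (i = 0; i < delay; ++i) {
            /* spin */
        }
        if (delay < 1024) {
            delay *= 2;
        }

        /* Wait until the lock at least looks free before the next CAS. */
        while (apr_atomic_read32(lock) != 0) {
            /* spin */
        }
    }
}

static void spin_unlock(volatile apr_uint32_t *lock)
{
    apr_atomic_set32(lock, 0);
}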

> >2) Sparc/GCC Code
>
> If you could send this as an actual patch that'll apply to the current
> code, it's FAR more likely to actually be committed.

Sorry, of course ... below is a patch against atomic/unix/apr_atomic.c,
which is the only file affected.

The following remarks are important when considering the patch for
integration; they also point out other bugs (!) in apr_atomic.c that I
noticed while looking at the code:

- The patch below and all remarks are relative to APR release 1.2.7.

- The "cas" Sparc assembler command is available only on v9 and later,
which basically means that it only runs on UltraSparc and Niagara, and
must be compiled with -mcpu=v9.

- I have configured, built and "make test"ed on an 8-CPU Solaris 8 
machine with GCC 3.3 and GCC 4.0.2, using GNU as and ld, and all seems
to work...

(This required an extra CFLAGS="-mcpu=v9" when invoking configure to
enable the v9 assembly; I had some strange problems with GCC 2.95.3
that I couldn't resolve within 10 minutes.)

- There are some places where the code in apr_atomic.c "falls into the
volatile trap", i.e. uses volatile as the only means of ensuring memory
consistency. For example, the final function in apr_atomic.c,

#if !defined(APR_OVERRIDE_ATOMIC_READ32)
APR_DECLARE(apr_uint32_t) apr_atomic_read32(volatile apr_uint32_t *mem)
{
    return *mem;
}
#endif /* !APR_OVERRIDE_ATOMIC_READ32 */

is _not_ sufficient on most non-x86 architectures with more than one
CPU, since volatile does _not_ enforce actual synchronisation of the
value across CPUs. For a correct version for Sparc see my patch below,
which includes a memory barrier instruction to force the read to
deliver the correct, up-to-date value (a short illustration of the
difference follows after these remarks). A corrected version reads

APR_DECLARE(apr_uint32_t) apr_atomic_read32(apr_uint32_t *mem)
{
    appropriate_memory_barrier();
    return *(volatile apr_uint32_t *)mem;
}

(The subject of volatile has created heated discussions on other
mailing lists; see for example recent discussions on LKML, in
particular Linus Torvalds' remarks, which give a very good overview of
why most uses of volatile are wrong, superfluous, or hide bugs. I am
also willing to discuss this subject further in order to help
understand and debug this and the other remaining issues.)
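
To make the trap a bit more concrete, here is a side-by-side sketch (my
own illustration, not part of the patch): the volatile qualifier only
keeps the compiler from caching or eliminating the access, while the
generated code is still just an ordinary load; it takes an explicit
barrier instruction, as in the patched Sparc version, to also constrain
the CPU:

#include "apr.h"   /* apr_uint32_t */

/* What the generic code does today: volatile stops the compiler from
 * optimising the load away, but the CPU still executes a plain load
 * with no ordering constraint relative to other memory traffic. */
static apr_uint32_t read_volatile_only(volatile apr_uint32_t *mem)
{
    return *mem;
}

/* What the Sparc v9 patch does: an explicit membar plus a "memory"
 * clobber, so earlier stores and loads are ordered before the load
 * that follows, and the compiler cannot reorder around the asm either. */
static apr_uint32_t read_with_membar(volatile apr_uint32_t *mem)
{
    __asm__ __volatile__ ("membar #StoreLoad | #LoadLoad" : : : "memory");
    return *mem;
}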

Regards, Colin


colin# diff apr_atomic.c.old apr_atomic.c
164a165,245
> #if (defined(__SPARC__) || defined(__sparc__)) && defined(__GNUC__ ) \
>    && !defined(USE_GENERIC_ATOMICS)
> 
> APR_DECLARE(apr_uint32_t) apr_atomic_cas32(volatile apr_uint32_t * mem,
>                                            apr_uint32_t with,
>                                            apr_uint32_t cmp)
> {
>    __asm__ __volatile__ ( "cas [%1],%2,%0"
>                           : "+r" ( with )
>                           : "r" ( mem ), "r" ( cmp ) );
>    return with;
> }
> #define APR_OVERRIDE_ATOMIC_CAS32
> 
> APR_DECLARE(apr_uint32_t) apr_atomic_read32(volatile apr_uint32_t * mem)
> {
>    __asm__ __volatile__ ( "membar #StoreLoad | #LoadLoad" : : : "memory" );
>    return *(volatile apr_uint32_t *)mem;
> }
> #define APR_OVERRIDE_ATOMIC_READ32
> 
> APR_DECLARE(void) apr_atomic_set32(volatile apr_uint32_t * mem,
>                                    apr_uint32_t val)
> {
>    *(volatile apr_uint32_t *)mem = val;
>    __asm__ __volatile__ ( "membar #StoreStore | #StoreLoad" : : : "memory" );
> }
> #define APR_OVERRIDE_ATOMIC_SET32
> 
> APR_DECLARE(apr_uint32_t) apr_atomic_add32(volatile apr_uint32_t * mem,
>                                            apr_uint32_t val)
> {
>    apr_uint32_t nrv;
> 
>    do {
>       nrv = apr_atomic_read32(mem);
>    } while (apr_atomic_cas32(mem,nrv+val,nrv)!=nrv);
> 
>    return nrv;
> }
> #define APR_OVERRIDE_ATOMIC_ADD32
> 
> APR_DECLARE(void) apr_atomic_sub32(volatile apr_uint32_t * mem, apr_uint32_t val)
> {
>    apr_atomic_add32(mem,-val);
> }
> #define APR_OVERRIDE_ATOMIC_SUB32
> 
> APR_DECLARE(apr_uint32_t) apr_atomic_inc32(volatile apr_uint32_t * mem)
> {
>    return apr_atomic_add32(mem,1);
> }
> #define APR_OVERRIDE_ATOMIC_INC32
> 
> APR_DECLARE(int) apr_atomic_dec32(volatile apr_uint32_t * mem)
> {
>    apr_uint32_t nrv;
> 
>    do {
>       nrv = apr_atomic_read32(mem);
>    } while (apr_atomic_cas32(mem,nrv-1,nrv)!=nrv);
> 
>    return nrv != 1;
> }
> #define APR_OVERRIDE_ATOMIC_DEC32
> 
> APR_DECLARE(apr_uint32_t) apr_atomic_xchg32(volatile apr_uint32_t * mem,
>                                             apr_uint32_t val)
> {
>    apr_uint32_t nrv;
> 
>    do {
>       nrv = apr_atomic_read32(mem);
>    } while (apr_atomic_cas32(mem,val,nrv)!=nrv);
> 
>    return nrv;
> }
> #define APR_OVERRIDE_ATOMIC_XCHG32
> 
> #endif /* __SPARC__ && __GNUC__ */
> 
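
In case it helps with reviewing the semantics, a small usage sketch
follows (object_t and object_destroy are made-up names, not part of APR
or of the patch): reference counting on top of the 32-bit atomics,
relying on apr_atomic_dec32() returning zero exactly when the counter
drops to zero, which is what the implementation above does by returning
nrv != 1.

#include <stdlib.h>
#include "apr_atomic.h"

/* Hypothetical reference-counted object; object_t and object_destroy()
 * are invented for this sketch and are not APR functions. */
typedef struct object_t {
    volatile apr_uint32_t refcount;
    /* ... payload ... */
} object_t;

/* Invented cleanup function, stands in for whatever the object needs. */
static void object_destroy(object_t *obj)
{
    free(obj);
}

static void object_ref(object_t *obj)
{
    apr_atomic_inc32(&obj->refcount);
}

static void object_unref(object_t *obj)
{
    /* apr_atomic_dec32() returns zero only when the count reaches zero,
     * so exactly one caller performs the destruction. */
    if (apr_atomic_dec32(&obj->refcount) == 0) {
        object_destroy(obj);
    }
}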
