while testing our recent httpd metadata patches to mod_deflate.c, we encountered an infinite loop in the new code on AIX.  the loop with the changed code amounts to:

while (!APR_BRIGADE_EMPTY(bb)) {
     e = APR_BRIGADE_FIRST(bb);
     if (blah) {

the loop happens with our normal xlc -O2 optimization.  turning off optimization makes the problem go away.  when walking thru the optimized code with dbx, it looks like neither the APR_BRIGADE_EMPTY nor the APR_BRIGADE_FIRST are accessing the bb head pointers in memory after the loop is initialized.  at first I was convinced this was an xlc optimizer bug, but Googling did not reveal any known problems like this.

then I started wondering how a compiler would know that the bb head structure has changed.  there is nothing in the expansion of APR_BUCKET_REMOVE that provides a clue that this could update the brigade (a.k.a. ring head).  if it hasn't changed, the optimizer might think it is safe to move the bb head memory accesses out of the loop.  sure enough, the following patch makes the problem go away:

   --- apr_ring.h         29 Jul 2003 20:19:26 -0000         1.1
   +++ apr_ring.h         15 Aug 2007 14:50:00 -0000         1.2
   @@ -128,8 +128,8 @@
    #define APR_RING_HEAD(head, elem)                                             \
        struct head {                                                               \
   -         struct elem *next;                                                      \
   -         struct elem *prev;                                                      \
   +         struct elem * volatile next;                                             \
   +         struct elem * volatile prev;                                             \

why haven't we seen this before?  actually, I believe we have.  Jeff T. told me off list that he saw a problem with xlc -O2 in httpd's server/core.c.  he bypassed it by inserting a no-op function call after the BRIGADE_NORMALIZE containing an APR_BUCKET_REMOVE macro.  the function call would also warn an optimizer to be conservative and don't trust previously fetched pointers.  looking at the while loop in mod_deflate, the other code paths prior to the metadata patches always involve a function call.  the bucket insertion macros explicitly reference the head structure, so no problem there.  apr_bucket_delete generates a call to the destroy() function, so it is safe.

comments?  is there a better way to insure that an optimizing compiler knows what's up?