2009/6/18 Gavin Sherry <swm@alcove.com.au>
2009/6/18 Bojan Smojver <bojan@rexursive.com>
On Thu, 2009-06-18 at 13:22 +0200, Gavin Sherry wrote:

> So, we're trying to store r0 (int 6144) at 4(r5) = 4 + 844(r2) above.
> Seems like this should all work fine... remarkably trivial stuff. I've
> even tried two different versions of GCC with the same result. :(

So, if you stop and try to examine that particular address, does it
crash too?

(My messages have been getting caught by your spam filter, probably because of all the references to registers and so on. Hopefully more meaningful text here will make that stop.)


This is quite weird.

(gdb) print &proc_mutex_op_try       
$4 = (struct sembuf *) 0x20001f00
(gdb) set proc_mutex_op_try.sem_flg = 1111
(gdb) print proc_mutex_op_try
$5 = {sem_num = 0, sem_op = 0, sem_flg = 1111}
(gdb) info registers
r0             0x1800   6144
r1             0x2ff228b0       804399280
r2             0x200018c8       536877256
r3             0xffffffff       -1
r4             0x20001f48       536878920
r5             0x1001e828       268560424 <-----
r6             0x20000ab0       536873648
r7             0x1001e84c       268560460
r8             0x1001e840       268560448
r9             0x0      0
r10            0x0      0
r11            0x1000   4096
r12            0x48244280       1210335872
r13            0xdeadbeef       -559038737
r14            0x1      1
r15            0x2ff22a04       804399620
r16            0x2ff22a0c       804399628
r17            0x0      0
r18            0xdeadbeef       -559038737
r19            0xdeadbeef       -559038737
r20            0xdeadbeef       -559038737
r21            0xdeadbeef       -559038737
r22            0xdeadbeef       -559038737
r23            0xdeadbeef       -559038737
r24            0xdeadbeef       -559038737
r25            0xdeadbeef       -559038737
r26            0xdeadbeef       -559038737
r27            0xb      11
r28            0x20000900       536873216
r29            0x10000000       268435456
r30            0x2ff22948       804399432
r31            0x0      0
pc             0x10005e74       268459636
ps             0x2d0b2  184498
cr             0x44244242       1143226946
lr             0x100079b4       268466612
ctr            0x0      0
xer            0x8      8
fpscr          0x0      0
vscr           0x0      0
vrsave         0x0      0
(gdb) ni


Program received signal SIGSEGV, Segmentation fault.
apr_proc_mutex_unix_setup_lock () at locks/unix/proc_mutex.c:176
176         proc_mutex_op_try.sem_flg = SEM_UNDO | IPC_NOWAIT;
1: x/i $pc  0x10005e74 <apr_proc_mutex_unix_setup_lock+44>:     sth     r0,4(r5)

So, when we try to store r0 at r5 + 4 (0x1001e828 + 4 = 0x1001e82c), we're writing to an address that is nowhere near proc_mutex_op_try. gdb puts the struct at 0x20001f00, so sem_flg, at offset 4, should be at 0x20001f04.
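To make the address arithmetic explicit, here is a small sketch in C. The helper names expected_store_addr and actual_store_addr are hypothetical and not part of APR; the only inputs are the values from the gdb session above. It assumes the usual struct sembuf layout of three 16-bit fields, which is what puts sem_flg at offset 4.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical helper: where the store SHOULD land, i.e. the address of
 * the struct plus the offset of sem_flg within struct sembuf. */
static uint32_t expected_store_addr(uint32_t struct_base)
{
    return struct_base + (uint32_t)offsetof(struct sembuf, sem_flg);
}

/* Hypothetical helper: where a D-form PowerPC store such as
 * "sth r0,4(r5)" actually lands: base register plus the sign-extended
 * 16-bit displacement. (This ignores the special case where the base
 * register field is 0, which does not apply to r5.) */
static uint32_t actual_store_addr(uint32_t base_reg, int16_t disp)
{
    return base_reg + (uint32_t)(int32_t)disp;
}
```

With the values from the session, expected_store_addr(0x20001f00) gives 0x20001f04, while actual_store_addr(0x1001e828, 4) gives 0x1001e82c, so the generated code is using the wrong base address rather than the wrong offset.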

I might try a current GCC and see if that helps. Alternatively, I've seen some other prebuilt APR binary releases for AIX. I'll disassemble the relevant function and see what's going on there.

Other than that, does anyone else on the list have access to AIX? Can the problem be reproduced? The hardware on the machine itself seems sound, as I've compiled several other large pieces of code which do not exhibit this kind of peculiar behaviour.
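For anyone who wants to try reproducing this outside of APR, a reduced test case might look something like the sketch below. It is only an approximation of what line 176 of proc_mutex.c does: the names try_op, setup_try_op and try_op_flags are hypothetical, and the sem_num and sem_op values are assumptions. Only the sem_flg assignment is taken from the source line shown in the gdb output.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical stand-in for APR's file-scope proc_mutex_op_try. */
static struct sembuf try_op;

/* Mirrors the faulting statement: a halfword store into a static
 * struct sembuf. The sem_num and sem_op values are assumed. */
static void setup_try_op(void)
{
    try_op.sem_num = 0;
    try_op.sem_op  = -1;
    try_op.sem_flg = SEM_UNDO | IPC_NOWAIT;
}

/* Runs the setup and returns the flags so a caller can check them. */
static int try_op_flags(void)
{
    setup_try_op();
    return try_op.sem_flg;
}
```

Calling try_op_flags() from main and comparing the result against SEM_UNDO | IPC_NOWAIT should succeed with a healthy toolchain. Building it with the suspect compiler, using the same flags as the APR build, and disassembling setup_try_op would show whether the same bad base address turns up.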

I reverted to an older compiler (gcc 4.0.0, shipped by IBM) and the tests pass. So, this seems to be a compiler bug for this platform. Thanks for your help.

Gavin