subversion-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stefan Fuhrmann <stefan.fuhrm...@wandisco.com>
Subject Re: 1.9.0 crash in ra_test (Solaris, Bus error, Alignment)
Date Thu, 06 Aug 2015 16:17:41 GMT
On Thu, Aug 6, 2015 at 4:54 PM, Rainer Jung <rainer.jung@kippdata.de> wrote:

> Am 06.08.2015 um 16:36 schrieb Rainer Jung:
>
>> Position of crash:
>>
>> #0  0xff29f194 in vparse_tuple (pool=pool@entry=0x35f10,
>> fmt=fmt@entry=0xffbff264, ap=ap@entry=0xffbff20c,
>>      items=<error reading variable: Unhandled dwarf expression opcode
>> 0xfa>, items=<error reading variable: Unhandled dwarf expression opcode
>> 0xfa>)
>>      at subversion/libsvn_ra_svn/marshal.c:1310
>> 1310            *va_arg(*ap, apr_uint64_t *) = elt->u.number;
>>
>> elt is:
>>
>> (gdb) print *elt
>> $2 = {kind = SVN_RA_SVN_NUMBER, u = {number = 2, string = 0x0, word =
>> 0x0, list = 0x0}}
>>
>> The addresses are:
>>
>>    elt address is: 0x7012c
>>    elt->u and elt->u.number addresses are both: 0x70134
>>
>> and the crash happens when elt->u.number is being accessed as an
>> apr_uint64_t under this address which is only 4 byte aligned.
>>
>> I haven't tracked down, where elt is actually being allocated. That
>> would be the place to make sure, it is 8 byte aligned. It should be
>> automatic if allocated using its type svn_ra_svn_item_t, but maybe it is
>> allocated in a more generic way with a type the compiler can not align
>> correctly for the later use as svn_ra_svn_item_t.
>>
>
> I think the root cause is here (file subversion/libsvn_ra_svn/marshal.c):
>
>   1082        /* Allocate an APR array with room for (initially) 4 items.
>   1083         * We do this manually because lists are the most frequent
> protocol
>   1084         * element, often used to frame a single, optional value.
> We save
>   1085         * about 20% of total protocol handling time. */
>   1086        char *buffer = apr_palloc(pool, sizeof(apr_array_header_t)
>   1087                                        + 4 *
> sizeof(svn_ra_svn_item_t));
>   1088        svn_ra_svn_item_t *data
>   1089          = (svn_ra_svn_item_t *)(buffer +
> sizeof(apr_array_header_t));
>   1090
>   1091        item->kind = SVN_RA_SVN_LIST;
>   1092        item->u.list = (apr_array_header_t *)buffer;
>
> "buffer" is not specifically aligned, the array members in "item->u.list =
> (apr_array_header_t *)buffer" could be misaligned.
>
> The following (ugly) workaround fixes it for me:
>
> --- subversion/libsvn_ra_svn/marshal.c.kpdt_orig        Fri Feb 13
> 12:17:40 2015
> +++ subversion/libsvn_ra_svn/marshal.c  Thu Aug  6 17:46:58 2015
> @@ -1083,10 +1083,16 @@
>         * We do this manually because lists are the most frequent protocol
>         * element, often used to frame a single, optional value.  We save
>         * about 20% of total protocol handling time. */
> -      char *buffer = apr_palloc(pool, sizeof(apr_array_header_t)
> +
> +      /* Make sure the data part of the buffer has appropriate alignment
> +         by prefixing it with a size that fits the needed
> apr_array_header_t
> +         but is itself highly aligned. */
> +      size_t offset = sizeof(apr_array_header_t) / 8 * 8;
> +
> +      char *buffer = apr_palloc(pool, offset
>                                        + 4 * sizeof(svn_ra_svn_item_t));
>        svn_ra_svn_item_t *data
> -        = (svn_ra_svn_item_t *)(buffer + sizeof(apr_array_header_t));
> +        = (svn_ra_svn_item_t *)(buffer + offset);
>
>        item->kind = SVN_RA_SVN_LIST;
>        item->u.list = (apr_array_header_t *)buffer;
>
> But of course its a bit rough, because it would apply on all platforms,
> even if not needed. Also on some (future?) platforms, the alignment for 8
> bytes might not always be correct.
>
> It's a bit tragic that this code part is prefixed with:
>
> * We do this manually because lists are the most frequent protocol
> * element, often used to frame a single, optional value.  We save
> * about 20% of total protocol handling time. */
>
> and the trap is that doing it manually often is harder than expected.
> Switching to apr_array_make() would have not introduced this bug, but of
> course you did it for a reason.
>
> Let me know, if I should test any other patch.
>

Hi Rainer,

Thanks for the report - it took me a while to figure out what the problem
is.

Apparently, the machine uses 32 bit pointers but reads 64 bit ints as atomic
values (i.e. using a native 64 bit read op) and requires that to be properly
aligned. I'll have to go through the code to check if there are other places
that imply "void * alignment is good enough".

The fix is in r1694533 and is scheduled for backport into 1.9.x.

-- Stefan^2.

Mime
View raw message