At 06:37 AM 4/27/2007 -0500, William A. Rowe wrote:
>Lucian Adrian Grijincu wrote:
>> in aprconv10faster.patch you added:
>>
>> static const char digits[] = "0123456789";
>> *p = digits[magnitude % 10];
>>
>> Why is this faster than:
>> *p = (char) '0' + (magnitude % 10); ?
[snip]
>> Am I missing something here?
>
>nope - the proposed change is a bit more expensive. (magnitude % 10 in
>any case being the unavoidably most expensive bit.)
>
>The only justification would be a code page where the digits aren't
>sequential
>characters, but there is no such thing. Note Davi's approach is sensible for
>hex, and for alpha mappings which are subject to oddities such as ebcdic.
Some thoughts:
1. For hexadecimal, what about something like:
int digit = magnitude & 15;
*p = '0' + (digit >= 10) * ('A' - '0' - 10) + digit;
On architectures like IA64, substantial parallelism may be gained by such
an expression and with no data loads:
; r1 == magnitude
; result in r3
{
and r2 = 15, r1
} ;;
{
cmp.ge p1 = 10, r2
add r3 = '0', r2
} ;;
{
(p1) add r3 = r3, 7 ; 'A' - '0' - 10
}
Even on x86, it should still be pretty fast:
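; eax = magnitude
; result in ebx
mov ecx, eax
and ecx, 15
lea ebx, [ecx + '0'] ; ebx = '0' + digit
cmp ecx, 10          ; compare only, so the upper bits of ecx stay zero
setge cl             ; cl = 1 if digit >= 10, else 0
neg cl               ; cl = 0xFF or 0
and cl, 7            ; cl = 7 ('A' - '0' - 10) or 0
add ebx, ecx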
I don't know if today's compilers approach that level of brevity, though...
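For anyone who wants to try the hexadecimal idea in isolation, here is a minimal sketch in C. The function names are made up for illustration and are not part of APR. It assumes a character set where 'A' through 'F' are contiguous, as in ASCII, and it includes the table lookup only so the two variants can be compared against each other.

```c
#include <stdint.h>

/* Hypothetical helper, not an APR function.
 * (digit >= 10) evaluates to 0 or 1 in C, so the gap between '9' and 'A'
 * is added only for digits 10..15. No table load is needed. */
static char hex_digit_branchless(uint32_t magnitude)
{
    unsigned digit = magnitude & 15;
    return (char)('0' + (digit >= 10) * ('A' - '0' - 10) + digit);
}

/* Hypothetical helper: the table-lookup variant, for comparison. */
static char hex_digit_table(uint32_t magnitude)
{
    static const char digits[] = "0123456789ABCDEF";
    return digits[magnitude & 15];
}
```

Whether the compiler turns the first version into branch-free code depends on the compiler and the target, so it would need to be measured rather than assumed.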
2. For decimal, what about a table of 100 apr_int16_t values storing two
digits each? It would be more data to cache, but it would allow the loop to
emit two digits per iteration:
(statically)
short *two_digit_lut = (short *)
"0001020304050607080910111213141516171819"
"2021222324252627282930313233343536373839"
"4041424344454647484950515253545556575859"
"6061626364656667686970717273747576777879"
"8081828384858687888990919293949596979899";
(in the loop)
if (((int)p & 1) == 0) {
    /* Ensure alignment because some platforms will complain */
    *p-- = '0' + (magnitude % 10);
    magnitude /= 10;
}
/* Eat two digits at a time. */
while (magnitude > 9) {
    *(short *)--p = two_digit_lut[magnitude % 100];
    p--, magnitude /= 100;
}
/* Store the last digit, if there is one. */
if (magnitude)
    *p-- = '0' + magnitude;
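To make the two-digit idea easier to test on its own, here is a self-contained sketch. It is a variation, not the code above: it keeps the table as plain chars and copies each pair with memcpy instead of storing through a short pointer, so it needs no alignment step. The function name conv_10_pairs and the pointer convention (digits are written backwards, ending just before buf_end) are assumptions made for this example, not APR's actual interface.

```c
#include <stdint.h>
#include <string.h>

/* Entry n occupies bytes 2n and 2n+1: tens digit first, then ones digit. */
static const char two_digit_lut[] =
    "0001020304050607080910111213141516171819"
    "2021222324252627282930313233343536373839"
    "4041424344454647484950515253545556575859"
    "6061626364656667686970717273747576777879"
    "8081828384858687888990919293949596979899";

/* Hypothetical helper, not an APR function.
 * Writes the decimal digits of magnitude so that the last digit lands at
 * buf_end[-1], and returns a pointer to the first digit. The caller must
 * provide at least 10 writable bytes before buf_end. */
static char *conv_10_pairs(uint32_t magnitude, char *buf_end)
{
    char *p = buf_end;

    /* Eat two digits at a time. */
    while (magnitude > 9) {
        p -= 2;
        memcpy(p, &two_digit_lut[(magnitude % 100) * 2], 2);
        magnitude /= 100;
    }
    /* Store the last digit, if there is one, or a lone "0". */
    if (magnitude || p == buf_end)
        *--p = (char)('0' + magnitude);
    return p;
}
```

A caller would terminate the buffer itself, for example by setting buf[15] = '\0' in a 16-byte buffer and passing buf + 15 as buf_end. A fixed two-byte memcpy is commonly compiled to a single load and store, but that is up to the compiler and should be checked on the target.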
Could these yield improvements over the existing approaches?
Jonathan Gilbert
