At 01:11 AM 4/28/2007 -0500, I wrote:
>At 06:37 AM 4/27/2007 -0500, William A. Rowe wrote:
>>nope - the proposed change is a bit more expensive. (magnitude % 10 in
>>any case being the unavoidably most expensive bit.)
[snip]
>    /* Eat two digits at a time. */
>    while (magnitude > 9) {
>        *(short *)--p = two_digit_lut[magnitude % 100];
>        --p, magnitude /= 100;
>    }
[snip]

Incidentally, I fixed up this loop (it should be "*(short *)(p -= 2)",
rather than splitting the subtraction like that, and the alignment
comparison should be inverted) and ran a little test, and apparently
Microsoft's compiler does not use an IDIV. Instead, it uses a bizarre
multiplication trick to obtain the values of "magnitude % 100" and
"magnitude / 100":

    ; magnitude in ecx
    mov eax, 1374389535
    imul ecx
    sar edx, 5
    mov eax, edx
    shr eax, 31
    add eax, edx
    ; eax is now equal to ecx / 100!
    mov edx, eax
    imul edx, 100
    sub ecx, edx
    ; ecx is now equal to magnitude % 100
    ; (magnitude - 100 * (magnitude / 100))

Intuitively, I wouldn't expect this long sequence to be faster, but it
must be or they wouldn't emit it. (I guess I'm underestimating the cost
of a division significantly!) They apparently use it whenever they see
both "value % n" and "value / n" near to one another. Also note how in
this case, since it is in a loop where the LCV is overwritten, their
register allocator has no qualms about overwriting the original
magnitude value in ecx, since it already has (magnitude / 100) for the
next loop in eax.

I'm not sure how the magic number 1374389535 is computed, but I'm sure
it's not rocket science once you know the trick.

If this is significantly faster than an IDIV or two, then other parts of
the loop become more significant.

Jonathan Gilbert
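
For reference, the multiply-and-shift sequence in the listing above can be modelled in portable C. This is only a sketch under stated assumptions, not a description of how the compiler derives its constants. The names magic_for_100, divmod100 and self_check are hypothetical. The sketch assumes that 1374389535 is the ceiling of 2^37 / 100 (the arithmetic agrees: 2^37 = 137438953472) and that right-shifting a negative signed integer is an arithmetic shift, as "sar" is on x86.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the magic constant is ceil(2^37 / 100). */
int64_t magic_for_100(void)
{
    return ((INT64_C(1) << 37) + 99) / 100;
}

/* Hypothetical helper that mirrors the emitted instructions step by step. */
void divmod100(int32_t magnitude, int32_t *quot, int32_t *rem)
{
    /* imul ecx: signed 64-bit product; edx receives the high 32 bits. */
    int64_t product = (int64_t)magnitude * magic_for_100();
    int32_t high = (int32_t)(product >> 32);

    /* sar edx, 5: the total shift is now 37 bits (assumes arithmetic shift). */
    int32_t q = high >> 5;

    /* shr eax, 31 / add eax, edx: add the sign bit, so that negative
       quotients are rounded toward zero as C's "/" requires. */
    q += (int32_t)((uint32_t)q >> 31);

    *quot = q;
    /* imul edx, 100 / sub ecx, edx */
    *rem = magnitude - q * 100;
}

/* Compare against the ordinary operators on a few sample values. */
void self_check(void)
{
    const int32_t samples[] = { 0, 1, 99, 100, 101, 12345,
                                -1, -99, -100, -12345,
                                INT32_MAX, INT32_MIN };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
        int32_t q, r;
        divmod100(samples[i], &q, &r);
        assert(q == samples[i] / 100);
        assert(r == samples[i] % 100);
    }
}
```

Calling divmod100(12345, &q, &r) leaves q equal to 123 and r equal to 45, and self_check() compares the result with the ordinary "/" and "%" operators for a handful of values, including negative ones and the extremes of the 32-bit range.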