At 10:03 PM 7/10/2002, Brian Pane wrote:
>Bill Stoddard wrote:
>
>>I've not looked at the generated code, but profiling indicates that an
>>additional division is happening, adding an extra 231 instructions.
>>(xlc_r -O2)
>
>If you redefine the macro as a shift, does the profile look better?
It isn't unbelievable.
Consider that constructing or devolving a -true- usec value becomes
more expensive [instruction-wise] because it needs to be factored
out with a modulo (e.g. time & (2^20 - 1)), then multiplied by
usec/sec, then divided by busec/sec.
The question is CPU cycles. Shifts and ANDs are cheap. Huge
integer division is not. At least factoring out a usec from an apr_time
is only a huge integer multiplication.
Of course factoring a usec into a busec is just as expensive as it
once was, since we multiply (shift) by busec/sec, then divide the
huge value by usec/sec. That is the single most expensive operation
in the entire scheme.
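
To make the cost asymmetry concrete, here is a minimal sketch of the two conversions, assuming a 20-bit busec fraction (2^20 busec per second). The macro and function names are hypothetical, for illustration only; they are not APR's actual API.

```c
#include <stdint.h>

/* Hypothetical names, assumed for illustration; not part of APR. */
#define BUSEC_BITS     20
#define BUSEC_PER_SEC  (INT64_C(1) << BUSEC_BITS)   /* 2^20 = 1048576 */
#define USEC_PER_SEC   INT64_C(1000000)

/* Extract the fractional part of a busec time as true usec.
 * Cost: one AND, one 64-bit multiply, one shift. No division. */
static int64_t busec_frac_to_usec(int64_t t)
{
    int64_t frac = t & (BUSEC_PER_SEC - 1);      /* modulo as a mask */
    return (frac * USEC_PER_SEC) >> BUSEC_BITS;  /* divide by busec/sec as a shift */
}

/* Convert a true usec fraction (0..999999) to busec.
 * Cost: one shift, then a real 64-bit division by 1000000,
 * which cannot be reduced to a shift. */
static int64_t usec_to_busec_frac(int64_t usec)
{
    return (usec << BUSEC_BITS) / USEC_PER_SEC;
}
```

Only the usec-to-busec direction keeps a genuine division; everything else reduces to a multiply, a shift, or a mask.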
BTW - if you are looking at NT numbers, Bill, keep in mind we have
additional optimizations due to the fact that we deal in 100ns units
and milliseconds on Win32, which can all be cleaned up for
busec-optimized calculations.
Don't forget to optimize both the busec/sec division and multiplication
to shifts, and the modulo operator to &(2^20-1).
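
As a rough sketch of what that means, again assuming 2^20 busec per second and using hypothetical names rather than the real APR macros, the division/modulo forms and the shift/mask forms side by side:

```c
#include <stdint.h>

/* Hypothetical names, assumed for illustration; not part of APR. */
#define BUSEC_BITS     20
#define BUSEC_PER_SEC  (UINT64_C(1) << BUSEC_BITS)

/* Whole seconds: division form vs. shift form. */
static uint64_t busec_sec_div(uint64_t t)    { return t / BUSEC_PER_SEC; }
static uint64_t busec_sec_shift(uint64_t t)  { return t >> BUSEC_BITS; }

/* Fractional busec: modulo form vs. mask form. */
static uint64_t busec_frac_mod(uint64_t t)   { return t % BUSEC_PER_SEC; }
static uint64_t busec_frac_mask(uint64_t t)  { return t & (BUSEC_PER_SEC - 1); }

/* Seconds to busec: multiplication form vs. shift form. */
static uint64_t busec_from_sec_mul(uint64_t s)   { return s * BUSEC_PER_SEC; }
static uint64_t busec_from_sec_shift(uint64_t s) { return s << BUSEC_BITS; }
```

The sketch uses unsigned values so each pair is exactly equivalent. With a signed 64-bit time, division truncates toward zero while an arithmetic right shift rounds toward negative infinity, so the two forms differ for negative times.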
Bill