From "Travis Vitek" <>
Subject RE: [PATCH] Use __rw_atomic_xxx() on Windows
Date Thu, 06 Sep 2007 01:46:26 GMT

Since I couldn't find an existing string perf test, I wrote up a quick
and dirty one that repeatedly makes many copies of the same string to
exercise the atomic increment/decrement. The results show a performance
penalty of roughly 3% (between 2.6% and 3.2%, depending on the thread
count) when using the newer atomic functions. This test was run with an
8d configuration, so the atomic functions were compiled into the stdcxx
dll. The test hardware is a Lenovo T60p [Intel Core 2 T7600 2.33GHz CPU,
2GB RAM].

  Old                new [patched]
  ------  1 threads  ------  1 threads
  ms            714  ms            737
  ms/op  0.00004256  ms/op  0.00004393
  ------  2 threads  ------  2 threads
  ms           3911  ms           4024
  ms/op  0.00023311  ms/op  0.00023985
  ------  4 threads  ------  4 threads
  ms           7660  ms           7865
  ms/op  0.00045657  ms/op  0.00046879
  ------  8 threads  ------  8 threads
  ms          15192  ms          15585
  ms/op  0.00090551  ms/op  0.00092894
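
A minimal sketch of what a test of this shape might look like. The
actual test source is not included in this message, so everything here
is an assumption: the names copy_test_result, copy_loop and
run_copy_test are hypothetical, C++11 std::thread and std::chrono stand
in for whatever threading and timing the real harness used, and ms/op is
taken to be the elapsed time divided by the per-thread copy count. Note
that the copies only exercise atomic increment/decrement when the string
implementation is reference counted; with a non-shared std::string the
same loop measures allocation and copying instead.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical result record; the field names are illustrative only.
struct copy_test_result {
    double      ms;            // wall-clock time for the whole run
    double      ms_per_op;     // ms divided by the per-thread copy count
    std::size_t total_length;  // checksum so the copies stay observable
};

// Each thread copies `source` `iterations` times. With a reference-counted
// string, every copy increments and every destruction decrements the
// shared count atomically.
inline void copy_loop(const std::string& source, std::size_t iterations,
                      std::size_t& length_sum)
{
    std::size_t sum = 0;
    for (std::size_t i = 0; i != iterations; ++i) {
        const std::string copy(source);
        sum += copy.size();
    }
    length_sum = sum;
}

// Runs `nthreads` copy loops concurrently and times them as a group.
inline copy_test_result run_copy_test(const std::string& source,
                                      std::size_t nthreads,
                                      std::size_t iterations)
{
    std::vector<std::size_t> sums(nthreads, 0);
    std::vector<std::thread> threads;
    threads.reserve(nthreads);

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t t = 0; t != nthreads; ++t)
        threads.emplace_back([&source, &sums, iterations, t] {
            copy_loop(source, iterations, sums[t]);
        });
    for (std::thread& th : threads)
        th.join();
    const auto stop = std::chrono::steady_clock::now();

    copy_test_result r;
    r.ms = std::chrono::duration<double, std::milli>(stop - start).count();
    r.ms_per_op = iterations ? r.ms / static_cast<double>(iterations) : 0.0;
    r.total_length = 0;
    for (std::size_t s : sums)
        r.total_length += s;
    return r;
}
```

Calling run_copy_test with 1, 2, 4 and 8 threads would give one ms and
ms/op pair per thread count, in the same shape as the table above.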

I'm wondering whether the cost would be reduced if we used inline
assembly for the __rw_atomic_* functions. We could also evaluate the
intrinsic pragma that is available on MSVC.
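
As a rough illustration of the intrinsic option, the sketch below wraps
MSVC's _InterlockedIncrement and _InterlockedDecrement, enabled through
#pragma intrinsic, in two inline functions. The wrapper names are
hypothetical and this is not how the __rw_atomic_* functions are
implemented; the non-MSVC branch uses the GCC __sync builtins only so
that the sketch compiles elsewhere.

```cpp
#if defined(_MSC_VER)
#  include <intrin.h>
   // Ask the compiler to expand these calls inline instead of calling
   // out to a function.
#  pragma intrinsic(_InterlockedIncrement)
#  pragma intrinsic(_InterlockedDecrement)
#endif

// Hypothetical wrapper: atomically increments *value and returns the
// new value.
inline long sketch_atomic_preincrement(volatile long* value)
{
#if defined(_MSC_VER)
    return _InterlockedIncrement(value);
#else
    return __sync_add_and_fetch(value, 1L);   // GCC/Clang fallback
#endif
}

// Hypothetical wrapper: atomically decrements *value and returns the
// new value.
inline long sketch_atomic_predecrement(volatile long* value)
{
#if defined(_MSC_VER)
    return _InterlockedDecrement(value);
#else
    return __sync_sub_and_fetch(value, 1L);   // GCC/Clang fallback
#endif
}
```

Because the wrappers are inline and the operations are intrinsics, a
call site would avoid the cross-dll function call that the 8d
configuration currently pays for. Whether that recovers the roughly 3%
would still have to be measured.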


>-----Original Message-----
>I will do a quick run using the string performance test after lunch.
>I'll report the results on that later. I've pasted the source for the
>bulk of my test below. If someone wants the entire thing, let me know
>and I'll provide everything.

