On 10/04/12 22:41, Liviu Nicoara wrote:
> On 10/4/12 10:10 PM, Liviu Nicoara wrote:
>>
>> On 10/3/12 11:10 AM, Martin Sebor wrote:
>>> On 10/03/2012 07:01 AM, Liviu Nicoara wrote:
>>>>
>>> void* thread_func (void*) {
>>>     for (int i = 0; i < N; ++i)
>>>         test 1: do some simple stuff inline
>>>         test 2: call a virtual function to do the same stuff
>>>         test 3: lock and unlock a mutex and do the same stuff
>>> }
>>>
>>> Test 1 should be the fastest and test 3 the slowest. This should
>>> hold regardless of what "simple stuff" is (eventually, even when
>>> it's getting numpunct::grouping() data).
>>
>> That is expected; I attached test case x.cpp and results-x.txt.
>>
>> I did not find it too interesting on its own, though. The difference
>> between the cached and non-cached data is that in the case of the
>> cached data the copying of the string involves nothing more than a
>> bump in the reference counter, whereas in the non-cached version a
>> string object is constructed anew, and memory gets allocated for its
>> body. Yet, in my measurements, the cached version is the one which
>> shows the worse performance.
>>
>> So, I extracted the std::string class and simplified it down and put
>> it in another test case. That would be u.cpp and the results are
>> results-u.txt. The results show the same performance trends although
>> the absolute values have skewed. Will get back to this after I digest
>> the results a bit more.

This discussion has stagnated lately. I hope to allocate some time this
weekend to go over the results, re-verify them and analyze the
disassembly. In the meantime, any thoughts?

Liviu
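
For anyone following along without the attachments, the three-way comparison Martin outlined could be sketched roughly as below. This is not the attached x.cpp. Every name in it (simple_stuff, Worker, SimpleWorker, test_inline, test_virtual, test_mutex, seconds, run_all) is hypothetical, the "simple stuff" is an arbitrary integer fold rather than numpunct::grouping() data, and it assumes a C++11 compiler (std::mutex, std::chrono) instead of the pthreads calls the original test would use. Each test_* function is only the loop body one thread would run; starting the threads is left out.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#include <mutex>

// Hypothetical "simple stuff": fold the loop index into an accumulator.
// Unsigned arithmetic wraps, so overflow is well defined.
inline unsigned long simple_stuff(unsigned long acc, unsigned long i) {
    return acc * 31 + i;
}

// Test 2 reaches the same work through a virtual call.
struct Worker {
    virtual ~Worker() {}
    virtual unsigned long work(unsigned long acc, unsigned long i) const = 0;
};

struct SimpleWorker : Worker {
    unsigned long work(unsigned long acc, unsigned long i) const override {
        return simple_stuff(acc, i);
    }
};

// Test 1: do the work inline.
unsigned long test_inline(unsigned long n) {
    unsigned long acc = 0;
    for (unsigned long i = 0; i < n; ++i)
        acc = simple_stuff(acc, i);
    return acc;
}

// Test 2: call a virtual function to do the same work.
unsigned long test_virtual(const Worker& w, unsigned long n) {
    unsigned long acc = 0;
    for (unsigned long i = 0; i < n; ++i)
        acc = w.work(acc, i);
    return acc;
}

// Test 3: lock and unlock a mutex around the same work.
unsigned long test_mutex(std::mutex& m, unsigned long n) {
    unsigned long acc = 0;
    for (unsigned long i = 0; i < n; ++i) {
        std::lock_guard<std::mutex> guard(m);
        acc = simple_stuff(acc, i);
    }
    return acc;
}

// Wall-clock time of one call to f, in seconds.
template <class F>
double seconds(F f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Run all three variants on the calling thread and print the timings.
void run_all(unsigned long n) {
    SimpleWorker w;
    std::mutex m;
    unsigned long r1 = 0, r2 = 0, r3 = 0;
    double t1 = seconds([&] { r1 = test_inline(n); });
    double t2 = seconds([&] { r2 = test_virtual(w, n); });
    double t3 = seconds([&] { r3 = test_mutex(m, n); });
    assert(r1 == r2 && r2 == r3);  // all three must do identical work
    std::printf("inline  %.3fs\nvirtual %.3fs\nmutex   %.3fs\n", t1, t2, t3);
}
```

Calling run_all(10000000) from main prints one timing per variant. With optimization on, the compiler may devirtualize the call in test 2 or simplify the loops, so the disassembly is worth checking before trusting the numbers. With a single thread the mutex is never contended, so this only shows the lock/unlock overhead.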