incubator-stdcxx-dev mailing list archives

From Liviu Nicoara <nikko...@hates.ms>
Subject Re: Fwd: Re: STDCXX-1071 numpunct facet defect
Date Sun, 30 Sep 2012 21:30:22 GMT
On 9/30/12 2:21 PM, Liviu Nicoara wrote:
> Forwarding with the attachment.
>
> -------- Original Message --------
> Subject: Re: STDCXX-1071 numpunct facet defect
> Date: Sun, 30 Sep 2012 12:09:10 -0600
> From: Martin Sebor <msebor@gmail.com>
> To: Liviu Nicoara <nikkoara@hates.ms>
>
>> On 9/27/12 8:27 PM, Martin Sebor wrote:
>
> Here are my timings for library-reduction.cpp when compiled with
> GCC 4.5.3 on Solaris 10 (4 SPARCV9 CPUs). I had to make a small
> number of trivial changes to get it to compile:
>
>           With cache   No cache
> real    1m38.332s     8m58.568s
> user    6m30.244s    34m25.942s
> sys     0m0.060s      0m3.922s
>
> I also experimented with the program on Linux (CEL 4 with 16
> CPUs). Initially, I saw no differences between the two versions.
> So I modified it a bit to make it closer to the library (the
> modified program is attached). With those changes the timings

I see the difference -- your program has a virtual function it calls from the 
inline grouping function.

> are below:
>
>           With cache   No cache
> real    0m 1.107s    0m 5.669s
> user    0m17.204s    0m 5.669s
> sys     0m 0.000s    0m22.347s
>
> I also recompiled and re-ran the test on Solaris. To speed
> things along, I set the number of threads and loops to 8 and
> 1000000. The numbers are as follows:
>
>           With cache   No cache
> real    0m3.341s     0m26.333s
> user    0m13.052s    1m37.470s
> sys     0m0.009s     0m0.132s
>
> The numbers match my expectation. The overhead without the
> "numpunct cache" is considerable.
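
For readers without the attachment, the two variants being timed can be sketched roughly as follows. This is a minimal illustration, not the stdcxx source or the attached program: the class punct_sketch, its use_cache flag and the virtual_calls counter are hypothetical names. It only shows an inline grouping() that either forwards to a virtual function on every call or lazily caches the first result without synchronization.

```cpp
#include <string>

// Hypothetical facet-like class; names are illustrative, not the stdcxx source.
class punct_sketch {
public:
    explicit punct_sketch(bool use_cache)
        : use_cache_(use_cache), cached_(false), virtual_calls_(0) {}
    virtual ~punct_sketch() {}

    // Inline accessor that forwards to the virtual function.
    std::string grouping() const {
        if (!use_cache_)
            return do_grouping();      // "no cache": virtual call on every use
        if (!cached_) {                // "with cache": unsynchronized lazy init,
            cache_ = do_grouping();    // a data race if two threads reach this
            cached_ = true;            // point at the same time
        }
        return cache_;
    }

    // Number of times the virtual function has actually been invoked.
    unsigned long virtual_calls() const { return virtual_calls_; }

protected:
    virtual std::string do_grouping() const {
        ++virtual_calls_;              // counter is for illustration only
        return "\3";
    }

private:
    bool use_cache_;
    mutable bool cached_;
    mutable std::string cache_;
    mutable unsigned long virtual_calls_;
};
```

Used from a single thread, the cached variant reaches do_grouping() once and the uncached variant reaches it on every call. The sketch deliberately leaves the lazy initialization unsynchronized, so it is only safe if the first call completes before other threads start.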

I have done another (smaller) round of measurements, this time using the test 
program you posted. Here are the results:

* iMac, 4x Intel, 12S:

16, 10000000:

         Cached       Not cached
real    0m9.300s     0m5.224s
user    0m36.441s    0m20.523s
sys     0m0.043s     0m0.068s

* iMac, 4x Intel, 12D:

         Cached        Not cached
real    0m9.012s      0m5.774s
user    0m35.343s     0m20.997s
sys     0m0.045s      0m0.183s

* Linux Slackware, 16x AMD Opteron, 12S:

16, 10000000:

         Cached     Not cached
real    0m29.798s  0m3.278s
user    0m48.662s  0m47.338s
sys     6m18.525s  0m3.298s

>
> Somewhat unexpectedly, the test with the cache didn't crash.

This time it did not crash on my iMac either (gcc 4.5.4). On the other box
(gcc 4.5.2) it crashed every time with caching, so I had to add a call to
fac.grouping outside the thread function to initialize the "facet".
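
The workaround amounts to something like the sketch below. It is an approximation under stated assumptions, not the attached test program: thread_func and run_test are hypothetical names, std::thread stands in for whatever threading API the real test uses, and the classic "C" locale (whose grouping is empty) is used to keep the example self-contained.

```cpp
#include <cstddef>
#include <locale>
#include <string>
#include <thread>
#include <vector>

// Hypothetical thread function: calls grouping() on the shared facet in a
// loop and counts the calls that returned the expected empty "C" grouping.
static void thread_func(const std::numpunct<char>* fac, unsigned long loops,
                        std::size_t* count)
{
    std::size_t n = 0;
    for (unsigned long i = 0; i < loops; ++i)
        if (fac->grouping().empty())
            ++n;
    *count = n;
}

// Runs nthreads threads and returns the total number of successful calls.
static std::size_t run_test(unsigned nthreads, unsigned long loops)
{
    const std::locale loc = std::locale::classic();
    const std::numpunct<char>& fac = std::use_facet<std::numpunct<char> >(loc);

    // The workaround: call grouping() once before any thread starts, so a
    // lazily initialized cache is populated while only one thread exists.
    const std::string warm_up = fac.grouping();
    (void)warm_up;

    std::vector<std::size_t> counts(nthreads, 0);
    std::vector<std::thread> threads;
    for (unsigned i = 0; i < nthreads; ++i)
        threads.emplace_back(thread_func, &fac, loops, &counts[i]);
    for (std::size_t i = 0; i < threads.size(); ++i)
        threads[i].join();

    std::size_t total = 0;
    for (std::size_t i = 0; i < counts.size(); ++i)
        total += counts[i];
    return total;
}
```

Calling run_test(16, 10000000) would mirror the thread and loop counts used in the measurements above. The warm-up call only hides the unsynchronized initialization; it does not make the cache itself thread-safe.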

Liviu
