incubator-stdcxx-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Liviu Nicoara <nikko...@hates.ms>
Subject Re: STDCXX-1056 : numpunct fix
Date Wed, 26 Sep 2012 01:03:16 GMT
On 9/24/12 11:06 PM, Stefan Teleman wrote:
> On Mon, Sep 24, 2012 at 7:48 PM, Liviu Nicoara <nikkoara@hates.ms> wrote:
>
>> Stefan, was it your intention to completely eliminate all the race
>> conditions with this last patch? Is this what the tools showed in your
>> environment?
>
>>> https://issues.apache.org/jira/browse/STDCXX-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13461452#comment-13461452
>
> Yes, all the race conditions in std::numpunct<> and std::moneypunct<>.
> Not all the race conditions in stdcxx.

FWIW, I have analyzed the performance of the latest patch from Stefan. The test 
program is the same test program used in the timings attachments to the 
incident. The version of the library was 4.2.1, compiler gcc 4.7.1, with:

1. The simple, perfect forwarding patch, with no caching, which does eliminate 
race conditions.
2. Stefan's latest patch
3. A patch I created which truly, if crudely, eliminates all facet race 
conditions by using a top-level lock for all facet operations(*)

The numbers from the first test run are already available in the incident.

The numbers for the second test run were a bit bigger, which is expected because 
of the additional synchronization. E.g., I have found that the test, run against 
the STDCXX locale database, gives:

$ time ./t en_US.UTF-8 4 5000000
4, 5000000

real	0m19.210s
user	0m32.659s
sys	0m36.507s

where the first patch gives:

$ time ./t en_US.UTF-8 4 50000000; done
4, 50000000

real    0m7.961s
user    0m31.211s
sys     0m0.003s

Notice the number of loops, 10x larger in the second set of number.

Now, for the third patch, the numbers are embarrassingly large. It practically 
slows the program to a crawl because it does not do any reads of facet data 
without holding the lock. This, I wager, eliminates all races in the 
numpunct/moneypunct facet. See:

$ time ./t en_US.UTF-8 4 100000
4, 100000

real	0m24.316s
user	0m28.213s
sys	0m6.633s

Now that is a whooping performance hit.

I went back to see an explanation between the numbers I see with Stefan's more 
performing patch, and my lock-all patch. I believe that there are still places 
in Stefan's patch where the facet makes unguarded reads of facet data, e.g.:

loc/_numpunct.h:
     190     // returns a pointer to the facet's implementation data
     191     // if it exists, 0 otherwise
     192     const void* _C_data () const {
     193         return _C_impsize ? _C_impdata
     194             : _RWSTD_CONST_CAST (__rw_facet*, this)->_C_get_data ();
     195     }

where _C_impsize/_C_impdata are read outside of a lock scope, etc. I.e., a 
thread may read the _C_impsize while, say, another is still writing the 
_C_impdata member variable.

I conclude, even with such simple testing, that any solution that attempts to 
completely eliminate the race conditions in these facets will incur a penalty, 
with the safest (top-level locking) the worst by orders of magnitude and any 
other solution somewhere in between.

Please let me know if you have any questions.

Liviu

(*) Will attach it as soon as we figure out how to undo the closing of the issue.

Mime
View raw message