incubator-stdcxx-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Travis Vitek <Travis.Vi...@roguewave.com>
Subject RE: STDCXX-1056 : numpunct fix
Date Thu, 20 Sep 2012 20:45:00 GMT


> -----Original Message-----
> From: Stefan Teleman [mailto:stefan.teleman@gmail.com]
> Sent: Thursday, September 20, 2012 10:11 AM
> To: dev@stdcxx.apache.org
> Subject: Re: STDCXX-1056 : numpunct fix
> 
> On Thu, Sep 20, 2012 at 8:07 AM, Liviu Nicoara <nikkoara@hates.ms>
> wrote:
> > But have you measured the amount of memory consumed by all STDCXX
> locale data loaded in one process? How much absolute time is spent in
> resizing the locale and facet buffers? What is the gain in space and
> time performance with such a change versus without? Just how fragmented
> the heap becomes and is there a performance impact because of it, etc.?
> IOW, before changing the status quo one must show an objective defect,
> produce a body of evidence, including a failing test case for the
> argument.
> 
> sizeof(std::locale) * number_of_locales.
> 
> I'll let you in on a little secret: once you call setlocale(3C) and
> localeconv(3C), the Standard C Library doesn't release its own locale
> handles until process termination. So you might think you save a lot
> of memory by destroying and constructing the same locales. You're
> really not. It's the Standard C Library locale data which takes up a
> lot of space.

You have a working knowledge of all Standard C Library implementations?

> 
> What I do know for a fact that this "optimization" did, was to cause
> the races conditions reported by 4 different thread analyzers. Race
> conditions are a show-stopper for me, and they are not negotiable.

The following is found near the top of the _C_manage method of __rw_facet.

    // acquire lock
    _RWSTD_MT_STATIC_GUARD (_RW::__rw_facet);

None of the shared data related to is read/written outside of the critical section protected
by that lock, and given the declaration for that shared data it cannot be accessed by any
code outside that function. Put bluntly, there is no way that there is a race condition relating
to the caching code itself.

Your Performance Analyzer output indicates a race (7 race accesses) for _C_manage...

  http://s247136804.onlinehome.us/22.locale.numpunct.mt.1.er.ts/

Specifically, it is calling out the following block of code.

##  7        0             488.             *__rw_access::_C_get_pid (*pfacet) =
                           489.                 _RWSTD_STATIC_CAST (_RWSTD_SIZE_T, (type +
1) / 2);

The function _C_get_pid simply exposes a reference to a data member of the given facet. That
function is thread safe. Provided that pfacet (the parameter passed to _C_manage) isn't being
accessed by another thread, there is no way that this code is not safe. It is possible that
calling code is not safe, but this code is clean.

Regardless, the proposed patch to _C_manage does nothing to change this block of code. I do
not understand how you can claim that this change eliminated the race conditions you are so
offended by. It is possible that other changes you have made eliminated the data races, but
I do not see how this change has any effect.

> 
> > And I love minimalistic code, and hate waste at the same time,
> especially in a general purpose library. To each its own.
> 
> Here's a helpful quote:
> 
> "We should forget about small efficiencies, say about 97% of the time:
> premature optimization is the root of all evil". It's from Donald
> Knuth.

By that measure, your entire patch could be considered evil. I've seen no evidence that the
subsequent two allocation/copy/deallocate/sort cycles required to get from 8 to 64 entries
is measurably more expensive, and I've seen nothing to indicate that a normal application
using the C++ Standard Library would be creating and destroying locale instances in large
numbers, or that doing so has a measureable impact on performance.

> And I love correct code which doesn't cause thread analyzers to report
> more than 12000 race conditions for just one test case. I've said it
> before and I will say it again: race conditions are a showstopper and
> are not negotiable. Period.

When the code in question has 12 threads that invoke a function 1000 times, you've found 1
race condition. I do agree data races are bad and should be fixed. But making changes to 'optimize'
the code instead of fixing it is actually much worse.

> 
> The patch is in scope for the issue at hand. The issue is that
> std::numpunct and std::moneypunct are not thread safe. This has been
> confirmed by 4 different thread analyzers, even after applying your
> _numpunct.h patch.

I looked at the output from the thread analyzer. It points out a data race in __rw::__rw_allocate(),
indicating that a memset() is responsible for a data race...

  http://s247136804.onlinehome.us/22.locale.numpunct.mt.1.er.ts/file.14.src.txt.html#line_43

Assuming that `operator new' is indeed thread safe (I didn't bother to look), I'm curious
to hear how this is an actual data race. I'm also curious to hear how you managed to avoid
having the same race appear in the output that you submitted with the proposed patch.

> You are more than welcome to submit your own patch which eliminates
> 100% of the race conditions.
> 
> --Stefan
> 
> --
> Stefan Teleman
> KDE e.V.
> stefan.teleman@gmail.com
Mime
View raw message