Message-ID: <504E2FAE.3010203@hates.ms>
Date: Mon, 10 Sep 2012 14:21:34 -0400
From: Liviu Nicoara
To: dev@stdcxx.apache.org
Subject: Re: STDCXX-1056 [was: Re: STDCXX forks]
In-Reply-To: <504937AC.70102@gmail.com>

On 09/06/12 19:54, Martin Sebor wrote:
>>> I'm not sure how easily we can do that. Almost all of locale
>>> is initialized lazily. Some of the layers might depend on the
>>> facets being initialized lazily as well. This was a deliberate
>>> design choice. One of the constraints was to avoid dynamic
>>> initialization or allocation at startup. [...]
>>
>> There would be a performance degradation. IMHO, it would be minor and would simplify the code considerably.

I have collected some numbers over the weekend. Using the test program I posted earlier, with minor tweaks, I timed a number of approaches. In each test I obtained the grouping string object and used its c_str method in an strcmp to a known string, in a tight loop, in multiple threads. The results seem to favor a non-caching implementation:

Times:

1. With current code, caching, no locking, safe because we initialize the grouping member outside the thread function:

    real    0m45.414s
    user    1m3.147s
    sys     9m40.410s

2. With caching of grouping value and DCII, using two additional mutex member vars in std::numpunct:

    real    0m34.360s
    user    0m52.313s
    sys     8m2.001s

3. With caching and DCII, using an additional mutex member in std::numpunct and an atomic exchange of the flag:

    real    0m34.073s
    user    0m52.028s
    sys     7m57.889s

4. Without caching of grouping values, grouping() delegates always to do_grouping():

    real    0m5.668s
    user    1m11.389s
    sys     0m3.952s

Thanks.
Liviu

The test program:

$ cat t.cpp
// The header names were stripped by the archive; these are the
// ones the code below needs.
#include <cstdlib>
#include <cstring>
#include <locale>
#include <string>

#include <pthread.h>
#include <unistd.h>

#define MAX_THREADS 16
#define MAX_LOOPS   10000000

static bool volatile hold = true;

typedef std::numpunct<char> Numpunct;

extern "C" {

static void*
f (void* pv)
{
    Numpunct const& fac = *reinterpret_cast< Numpunct* > (pv);

    // spin until main releases all threads at once
    while (hold) ;

    for (int i = 0; i < MAX_LOOPS; ++i) {
        const std::string grouping = fac.grouping ();
        if (strcmp (grouping.c_str (), "\003\003")) {
            abort ();
        }
    }

    return 0;
}

}

int
main (int, char** argv)
{
    std::locale const loc = std::locale (argv [1]);
    Numpunct const& fac = std::use_facet<Numpunct> (loc);

    fac.grouping ();   // Only for testing the current revision!

    pthread_t tid [MAX_THREADS] = { 0 };

    for (int i = 0; i < MAX_THREADS; ++i) {
        if (pthread_create (tid + i, 0, f, const_cast<Numpunct*> (&fac)))
            exit (-1);
    }

    sleep (1);
    hold = false;

    for (int i = 0; i < MAX_THREADS; ++i) {
        if (tid [i])
            pthread_join (tid [i], 0);
    }

    return 0;
}

The relevant facet code:

[...]
private:

    int _C_flags;   // bitmap of "cached data valid" flags

    string      _C_grouping;   // cached results of virtual members
    string_type _C_truename;
    string_type _C_falsename;
    char_type   _C_decimal_point;
    char_type   _C_thousands_sep;

    _RW::__rw_mutex _C_mutex1;
    _RW::__rw_mutex _C_mutex2;
};

[...]

template <class _CharT>
inline string numpunct<_CharT>::grouping () const
{
#if 1

    if (!(_C_flags & _RW::__rw_gr)) {

        numpunct* const __self = _RWSTD_CONST_CAST (numpunct*, this);

        // [try to] get the grouping first (may throw)
        // then set a flag to avoid future initializations
        __self->_C_grouping  = do_grouping ();
        __self->_C_flags    |= _RW::__rw_gr;
    }

    return _C_grouping;

#elif 0

    if (!(_C_flags & _RW::__rw_gr)) {

        numpunct* const __self = _RWSTD_CONST_CAST (numpunct*, this);

        _RWSTD_MT_GUARD (__self->_C_mutex1);

        if (!(_C_flags & _RW::__rw_gr)) {

            // [try to] get the grouping first (may throw)
            // then set a flag to avoid future initializations
            __self->_C_grouping = do_grouping ();

            // Atomic exchange has acquire and release semantics on
            // x86 and x86_64. Can still be re-ordered by the compiler.
            int tmp = __self->_C_flags |= _RW::__rw_gr;
            _RW::__rw_atomic_exchange (__self->_C_flags, tmp, true);
        }
    }

    return _C_grouping;

#elif 0

    if (!(_C_flags & _RW::__rw_gr)) {

        numpunct* const __self = _RWSTD_CONST_CAST (numpunct*, this);

        _RWSTD_MT_GUARD (__self->_C_mutex1);

        if (!(_C_flags & _RW::__rw_gr)) {

            // [try to] get the grouping first (may throw)
            // then set a flag to avoid future initializations
            __self->_C_grouping = do_grouping ();

            // Forces the compiler to preserve the order and introduces
            // barriers.
            _RWSTD_MT_GUARD (__self->_C_mutex2);
            __self->_C_flags |= _RW::__rw_gr;
        }
    }

    return _C_grouping;

#else

    return do_grouping ();

#endif   // 0
}
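
For comparison, the double-checked variants above might be written in portable C++11 roughly as follows. This is only a sketch, not the stdcxx code: it swaps _RWSTD_MT_GUARD and __rw_atomic_exchange for std::mutex and std::atomic, and the names LazyGrouping, computations() and demo() are hypothetical, made up for the illustration. The acquire/release pair on the flag is what stands in for the second mutex or the atomic exchange.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical stand-in for a facet that caches one lazily computed value.
class LazyGrouping {
public:
    // Returns the cached value, computing it at most once.
    std::string grouping () const {
        // Fast path: this acquire load pairs with the release store below,
        // so a thread that reads 'true' also sees the finished cached_ string.
        if (!ready_.load (std::memory_order_acquire)) {
            std::lock_guard<std::mutex> guard (mutex_);
            // Second check: another thread may have won the race while
            // this one was waiting for the mutex.
            if (!ready_.load (std::memory_order_relaxed)) {
                cached_ = do_grouping ();   // may throw; flag stays false
                ready_.store (true, std::memory_order_release);
            }
        }
        return cached_;
    }

    // Number of times the value was actually computed (for the demo only).
    int computations () const { return computations_.load (); }

private:
    // Stand-in for the virtual do_grouping (); returns a fixed value here.
    std::string do_grouping () const {
        ++computations_;
        return "\003\003";
    }

    mutable std::mutex        mutex_;
    mutable std::atomic<bool> ready_ {false};
    mutable std::string       cached_;
    mutable std::atomic<int>  computations_ {0};
};

// Usage sketch: repeated calls trigger a single computation.
void demo ()
{
    LazyGrouping fac;
    assert (fac.grouping () == "\003\003");
    assert (fac.grouping () == "\003\003");
    assert (fac.computations () == 1);
}
```

Note that even with the fast path, every call still copies the cached string on return, so this sketch says nothing about which approach is faster; the timings above are the only measurements.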