stdcxx-dev mailing list archives

From Liviu Nicoara <nikko...@hates.ms>
Subject Re: STDCXX-1056 : numpunct fix
Date Thu, 20 Sep 2012 12:07:06 GMT
Thanks for the feedback. Please see below.


On Sep 19, 2012, at 10:02 PM, Stefan Teleman wrote:

> On Wed, Sep 19, 2012 at 8:51 PM, Liviu Nicoara <nikkoara@hates.ms> wrote:
> 
>> I think you are referring to `live' cache objects and the code which
>> specifically adjusts the size of the buffer according to the number of
>> `live' locales and/or facets in it. In that respect I would not call that
>> eviction because locales and facets with non-zero reference counters are
>> never evicted.
>> 
>> But anyhoo, this is semantics. Bottom line is the locale/facet buffer
>> management code follows a principle of economy.
> 
> Yes it does. But we have to choose between economy and efficiency. To
> clarify: The overhead of having unused pointers in the cache is
> sizeof(void*) times the number of unused "slots".  This is 2012. Even
> an entry-level Android cell phone comes with 1GB system memory. If we
> want to talk about embedded systems, where memory constraints are more
> stringent than cell phones, then we're not talking about Apache stdcxx
> anymore, or any other open source implementation of the C++ Standard
> Library. These types of systems use C++ for embedded systems, which is
> a different animal altogether: no exceptions support, no RTTI. For
> example, see Green Hills: http://www.ghs.com/ec++.html. And even they have become
> more relaxed about memory constraints. They use BOOST.
> 
> Bottom line: so what if 16 pointers in this 32-slot pointer cache
> never get used? The maximum amount of "wasted memory" for these 16
> pointers is 128 bytes, on a 64-bit machine with 8-byte sized pointers.
> Can we live with that in 2012, a year when a $500 laptop comes with
> 4GB RAM out of the box? I would pick 128 bytes of allocated but unused
> memory over random and entirely avoidable memory churn any day.
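
For reference, the arithmetic in the quoted paragraph can be checked with a minimal sketch. The function and parameter names are made up for illustration and are not stdcxx identifiers; the slot count and pointer sizes are the ones quoted above, plus a 4-byte case assumed for comparison.

```cpp
#include <cassert>
#include <cstddef>

// Bytes occupied by pointer slots that are allocated but never used.
// `unused_slots` and `pointer_size` are illustrative parameters,
// not names taken from the stdcxx sources.
constexpr std::size_t wasted_bytes(std::size_t unused_slots,
                                   std::size_t pointer_size = sizeof(void*))
{
    return unused_slots * pointer_size;
}

// The figure from the message: 16 unused slots, 8-byte pointers.
static_assert(wasted_bytes(16, 8) == 128, "16 * 8 bytes");

// The same cache on a target with 4-byte pointers (assumed for comparison).
static_assert(wasted_bytes(16, 4) == 64, "16 * 4 bytes");
```

Calling `wasted_bytes(16)` without the second argument uses the pointer size of the platform the code is compiled for.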


The argument is plausible and fine as far as brainstorming goes. 

But have you measured the amount of memory consumed by all STDCXX locale data loaded in one
process? How much absolute time is spent resizing the locale and facet buffers? What is
the gain in space and time performance with such a change versus without? How fragmented
does the heap become, and is there a performance impact because of it? IOW, before changing
the status quo one must show an objective defect and produce a body of evidence for the
argument, including a failing test case.


> 
> My goal: I would be very happy if any application using Apache stdcxx
> would reach its peak instantiation level of localization (read: max
> number of locales and facets instantiated and cached, for the
> application's particular use case), and would then stabilize at that
> level *without* having to resize and re-sort the cache, *ever*. That
> is a locale cache I can love. I love binary searches on sorted
> containers. Wrecking the container with insertions or deletions, and
> then having to re-sort it again, not so much. Especially when I can't
> figure out why we're doing it in the first place.
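
To make the quoted goal concrete, here is a rough sketch of a pointer cache that is sized once, stays sorted, and is searched with a binary search. It is not the stdcxx locale cache: `Entry`, `SortedCache`, and the integer `id` key are hypothetical names chosen for illustration, and the capacity is assumed to be fixed at compile time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a cached facet; not an stdcxx type.
struct Entry {
    int id;
};

// Fixed-capacity cache of pointers kept sorted by id.
// The capacity never changes, so the buffer is never reallocated.
template <std::size_t Capacity>
class SortedCache {
public:
    SortedCache() : size_(0) {}

    // Binary search for id; returns a null pointer when absent.
    Entry* find(int id) const {
        Entry* const* end = slots_ + size_;
        Entry* const* pos = std::lower_bound(slots_, end, id, less_id);
        return (pos != end && (*pos)->id == id) ? *pos : nullptr;
    }

    // Insert at the sorted position, shifting the tail right by one slot.
    // Returns false when the cache is full or the id is already present.
    bool insert(Entry* e) {
        if (size_ == Capacity) return false;
        Entry** end = slots_ + size_;
        Entry** pos = std::lower_bound(slots_, end, e->id, less_id);
        if (pos != end && (*pos)->id == e->id) return false;
        std::copy_backward(pos, end, end + 1);
        *pos = e;
        ++size_;
        return true;
    }

    std::size_t size() const { return size_; }

private:
    // Comparison used by lower_bound: element on the left, key on the right.
    static bool less_id(const Entry* lhs, int id) { return lhs->id < id; }

    Entry*      slots_[Capacity];
    std::size_t size_;
};
```

In this sketch an insertion still shifts the tail of the array, but it never triggers a full re-sort or a reallocation; once the application stops adding entries, only `find` runs. Whether that trade-off beats the existing buffer management is exactly what measurements would have to show.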


And I love minimalistic code, and hate waste at the same time, especially in a general purpose
library. To each his own.


> 
>> Hey Stefan, are the above also timing the changes?
> 
> Nah, I didn't bother with the timings - yet - for a very simple
> reason: in order to use instrumentation, both with SunPro and with
> Intel compilers, optimization of any kind must be disabled. On SunPro
> you have to pass -xkeepframe=%all (which disables tail-call
> optimization as well), in addition to passing -xO0 and -g. So the
> timings for these unoptimized experiments would have been completely
> irrelevant.

Well, I think you are the only one around here with access to SPARC hardware, so your input is
very valuable in this sense. This is also the reason I kept asking that question earlier: do
we currently have any failing locale MT test when numpunct does just perfect forwarding, with
no caching? I.e., with changes to just _numpunct.h and to no other source file (as to silence
thread analyzer warnings), do any locale (or other) MT tests fail? I would greatly appreciate
it if you could give it a run on your hardware, if you don't already know the answer.
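
As an illustration of what "forwarding, with no caching" could look like, here is a simplified sketch. It is not the code in _numpunct.h: the class names are hypothetical and the facet is reduced to a non-template class over `char`. It only mirrors the standard pattern in which each public numpunct member calls its protected virtual `do_` counterpart.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified facet; it is not the stdcxx _numpunct.h code.
class ForwardingNumpunct {
public:
    virtual ~ForwardingNumpunct() {}

    // Each public member forwards straight to its virtual counterpart.
    // No result is stored in the facet object, so this class holds no
    // mutable state for concurrent callers to race on.
    char        decimal_point() const { return do_decimal_point(); }
    char        thousands_sep() const { return do_thousands_sep(); }
    std::string grouping()      const { return do_grouping(); }

protected:
    virtual char        do_decimal_point() const { return '.'; }
    virtual char        do_thousands_sep() const { return ','; }
    virtual std::string do_grouping()      const { return std::string(); }
};

// Hypothetical derived facet overriding the defaults.
class CommaDecimalNumpunct : public ForwardingNumpunct {
protected:
    char        do_decimal_point() const override { return ','; }
    char        do_thousands_sep() const override { return '.'; }
    std::string do_grouping()      const override { return "\3"; }
};
```

The cost of this approach is a virtual call, and for `grouping()` a string copy, on every access; the benefit is that the facet itself has nothing to synchronize.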

The discussion has been productive. But I object to the patch as is because it goes beyond
the scope of the original issue. I think this patch should only touch the MT defect detected
by the failing test cases. If you think the other parts you changed are defects, you should
open corresponding issues in JIRA and have them discussed separately.

Thanks,
Liviu