mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alexander Rojas <alexan...@mesosphere.io>
Subject Re: Review Request 52877: Fixed wrong float serialization in JSON in locales different from C.
Date Tue, 01 Nov 2016 14:44:08 GMT


> On Nov. 1, 2016, 2:43 p.m., Benjamin Bannier wrote:
> > 3rdparty/stout/include/stout/json.hpp, lines 680-681
> > <https://reviews.apache.org/r/52877/diff/2/?file=1550724#file1550724line680>
> >
> >     I believe we can drop the `static` here as these variables should have internal
linkage automatically.

Nope, `static` cannot be dropped, otherwise the following error is caused:

```
In file included from ../../../3rdparty/stout/tests/jsonify_tests.cpp:20:
../../../3rdparty/stout/include/stout/jsonify.hpp:154:9: error: '__thread' variables must
have global storage
        THREAD_LOCAL std::stringstream* ss = nullptr;
        ^
../../../3rdparty/stout/include/stout/thread_local.hpp:24:22: note: expanded from macro 'THREAD_LOCAL'
#define THREAD_LOCAL __thread
                     ^
```


> On Nov. 1, 2016, 2:43 p.m., Benjamin Bannier wrote:
> > 3rdparty/stout/include/stout/json.hpp, line 694
> > <https://reviews.apache.org/r/52877/diff/2/?file=1550724#file1550724line694>
> >
> >     Could you check if this temporary string is removed when compiling optimized?
If not IMHO it should be possible to avoid it by directly manipulating the underlying buffer's
state.

Manipulating the underlying buffer would end up calling the same methog but in the buffer
[std::basic_stringbuf<CharT>::str(std::basic_string<CharT> s)](http://en.cppreference.com/w/cpp/io/basic_stringbuf/str).
One can also attempt to modify the `seek` pointer position, but it may require write the tail
with zeroes.


> On Nov. 1, 2016, 2:43 p.m., Benjamin Bannier wrote:
> > 3rdparty/stout/include/stout/jsonify.hpp, lines 154-155
> > <https://reviews.apache.org/r/52877/diff/2/?file=1550725#file1550725line154>
> >
> >     Remove `static`.

see above.


> On Nov. 1, 2016, 2:43 p.m., Benjamin Bannier wrote:
> > 3rdparty/stout/include/stout/jsonify.hpp, line 181
> > <https://reviews.apache.org/r/52877/diff/2/?file=1550725#file1550725line181>
> >
> >     Investigate whether this creates a string in optimized code.

see above.


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52877/#review154397
-----------------------------------------------------------


On Nov. 1, 2016, 1:27 p.m., Alexander Rojas wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52877/
> -----------------------------------------------------------
> 
> (Updated Nov. 1, 2016, 1:27 p.m.)
> 
> 
> Review request for mesos, Adam B, Benjamin Bannier, Joris Van Remoortere, and Michael
Park.
> 
> 
> Bugs: MESOS-6349
>     https://issues.apache.org/jira/browse/MESOS-6349
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Resolves an isse with the JSON serialization functions where floats
> would use the process active locale as a decimal separator instead
> of the JSON conformant period (`.`).
> 
> 
> Diffs
> -----
> 
>   3rdparty/stout/include/stout/json.hpp 9734561549eea867d57f14a0ab71febeb9dddc06 
>   3rdparty/stout/include/stout/jsonify.hpp 8220d001001e8b9f247c05be58534f1174611aad 
> 
> Diff: https://reviews.apache.org/r/52877/diff/
> 
> 
> Testing
> -------
> 
> In order to test the different options the following benchmarking program was
> written using [Google benchmark](https://github.com/google/benchmark):
> 
> ```c++
> #include <benchmark/benchmark_api.h>
> #include <stout/json.hpp>
> 
> #include <iostream>
> #include <random>
> #include <string>
> #include <sstream>
> 
> static void BM_Jsonify(benchmark::State& state) {
>   while (state.KeepRunning()) {
>     state.PauseTiming();
>     JSON::Object object;
>     object.values["float00"] = 1234567890.12345;
>     object.values["float01"] = 123456789.012345;
>     object.values["float02"] = 12345678.9012345;
>     object.values["float03"] = 1234567.89012345;
>     object.values["float04"] = 123456.789012345;
>     object.values["float05"] = 12345.6789012345;
>     object.values["float06"] = 1234.56789012345;
>     object.values["float07"] = 123.456789012345;
>     object.values["float08"] = 12.3456789012345;
>     object.values["float09"] = 1.23456789012345;
>     object.values["float10"] = 1234567890.12345;
>     object.values["float11"] = 123456789.012345;
>     object.values["float12"] = 12345678.9012345;
>     object.values["float13"] = 1234567.89012345;
>     object.values["float14"] = 123456.789012345;
>     object.values["float15"] = 12345.6789012345;
>     object.values["float16"] = 1234.56789012345;
>     object.values["float17"] = 123.456789012345;
>     object.values["float18"] = 12.3456789012345;
>     object.values["float19"] = 1.23456789012345;
>     object.values["float20"] = 1234567890.12345;
>     object.values["float21"] = 123456789.012345;
>     object.values["float22"] = 12345678.9012345;
>     object.values["float23"] = 1234567.89012345;
>     object.values["float24"] = 123456.789012345;
>     object.values["float25"] = 12345.6789012345;
>     object.values["float26"] = 1234.56789012345;
>     object.values["float27"] = 123.456789012345;
>     object.values["float28"] = 12.3456789012345;
>     object.values["float29"] = 1.23456789012345;
>     object.values["float30"] = 1234567890.12345;
>     object.values["float31"] = 123456789.012345;
>     object.values["float32"] = 12345678.9012345;
>     object.values["float33"] = 1234567.89012345;
>     object.values["float34"] = 123456.789012345;
>     object.values["float35"] = 12345.6789012345;
>     object.values["float36"] = 1234.56789012345;
>     object.values["float37"] = 123.456789012345;
>     object.values["float38"] = 12.3456789012345;
>     object.values["float39"] = 1.23456789012345;
>     object.values["float40"] = 1234567890.12345;
>     object.values["float41"] = 123456789.012345;
>     object.values["float42"] = 12345678.9012345;
>     object.values["float43"] = 1234567.89012345;
>     object.values["float44"] = 123456.789012345;
>     object.values["float45"] = 12345.6789012345;
>     object.values["float46"] = 1234.56789012345;
>     object.values["float47"] = 123.456789012345;
>     object.values["float48"] = 12.3456789012345;
>     object.values["float49"] = 1.23456789012345;
> 
>     state.ResumeTiming();
> 
>     benchmark::DoNotOptimize(stringify(object));
>   }
> }
> 
> BENCHMARK(BM_Jsonify);
> 
> BENCHMARK_MAIN();
> ```
> 
> The program creates a JSON object with 50 floats in it, and then it creates
> its string representation. Since the algorithms in `json.hpp` and `jsonify.hpp`
> are similar but not the same, they were benchmarked as different. While the
> original solutions are non solutions since they produce erroneous results if
> localization is active, they constitute a baseline for comparison with the
> alternatives. These are the benchmarks for the original algorithms and their
> alternatives:
> 
> 1. `json.hpp` default:
> 
> ```c++
> char buffer[50] {};
> snprintf(
>     buffer,
>     sizeof(buffer),
>     "%#.*g",
>     std::numeric_limits<double>::digits10,
>     number.value);
> 
> std::string trimmed = strings::trim(buffer, strings::SUFFIX, "0");
> return out << trimmed << (trimmed.back() == '.' ? "0" : "");
> ```
> 
>    Perfomance
> 
> ```
> Run on (8 X 1000 MHz CPU s)
> 2016-11-01 11:08:49
> Benchmark    Time(ns)    CPU(ns) Iterations
> -------------------------------------------
> BM_Jsonify      62934      62927      11653
> ```
> 
> 2. `jsonify.hpp` default (it changes the call to `strings::trim` in order to
>    avoid string copies:
> 
> ```c++
> char buffer[50] {};
> const int size = snprintf(
>     buffer,
>     sizeof(buffer),
>     "%#.*g",
>     std::numeric_limits<double>::digits10,
>     number.value);
> 
> int back = size - 1;
> for (; back > 0; --back) {
>   if (buffer[back] != '0') {
>     break;
>   }
>   buffer[back] = '\0';
> }
> return out << buffer << (buffer[back] == '.' ? "0" : "");
> ```
> 
>    Performance
> 
> ```
> Run on (8 X 1000 MHz CPU s)
> 2016-11-01 11:10:21
> Benchmark    Time(ns)    CPU(ns) Iterations
> -------------------------------------------
> BM_Jsonify      58679      58658      11894
> ```
> 
> 3. Proposal 1, manually searching for a character which is not expected (and it
>    is assumed to be the decimal separator) and replacing it:
> 
> ```c++
> char buffer[50] {};
> snprintf(
>     buffer,
>     sizeof(buffer),
>     "%#.*g",
>     std::numeric_limits<double>::digits10,
>     number.value);
> 
> for (char &ch : buffer) {
>   if (!isdigit(ch) && ch != '-' && ch != '.' && ch!= '\0') {
>     ch = '.';
>     break;
>   }
> }
> 
> std::string trimmed = strings::trim(buffer, strings::SUFFIX, "0");
> return out << trimmed << (trimmed.back() == '.' ? "0" : "");
> ```
> 
>    Performance
> 
> ```
> Run on (8 X 1000 MHz CPU s)
> 2016-11-01 11:11:18
> Benchmark    Time(ns)    CPU(ns) Iterations
> -------------------------------------------
> BM_Jsonify      63559      63535      10825
> ```
> 
> 4. Proposal 2, the C++ approach, use streams and stream manipulators:
> 
> ```c++
> std::stringstream ss;
> ss.imbue(std::locale::classic());
> ss << std::setprecision(std::numeric_limits<double>::digits10)
>    << std::showpoint << number.value;
> 
> std::string trimmed = strings::trim(ss.str(), strings::SUFFIX, "0");
> return out << trimmed << (trimmed.back() == '.' ? "0" : "");
> ```
> 
>    Performance
> 
> ```
> Run on (8 X 1000 MHz CPU s)
> 2016-11-01 11:12:35
> Benchmark    Time(ns)    CPU(ns) Iterations
> -------------------------------------------
> BM_Jsonify      74237      74242       8678
> ```
> 
> 5. Proposal 3, try to avoid copies (there are still some being created):
> 
> ```c++
> std::stringstream ss;
> ss.imbue(std::locale::classic());
> ss << std::setprecision(std::numeric_limits<double>::digits10)
>    << std::showpoint << number.value;
> 
> std::string representation = ss.str();
> int back = representation.size() - 1;
> for (; back > 0; --back) {
>   if (representation[back] != '0') {
>     break;
>   }
> }
> representation.resize(back + 1);
> out << representation << (representation[back] == '.' ? "0" : "");
> ```
> 
>    Performance
> 
> ```
> Run on (8 X 1000 MHz CPU s)
> 2016-11-01 11:13:40
> Benchmark    Time(ns)    CPU(ns) Iterations
> -------------------------------------------
> BM_Jsonify      73785      73757       9534
> ```
> 
> 6. Using thread local variables:
> 
> ```c++
> static THREAD_LOCAL std::stringstream* ss;
> static THREAD_LOCAL std::once_flag once_flag;
> std::call_once(once_flag, [=]() {
>   ss = new std::stringstream;
>   ss->imbue(std::locale::classic());
> });
> *ss << std::setprecision(std::numeric_limits<double>::digits10)
>     << std::showpoint << number.value;.
>       std::string trimmed = strings::trim(ss->str(), strings::SUFFIX, "0");
> out << trimmed << (trimmed.back() == '.' ? "0" : "");
> ss->str("");
> return out;
> ```
> 
>    Performance
> 
> ```
> Run on (8 X 1000 MHz CPU s)
> 2016-11-01 11:14:45
> Benchmark    Time(ns)    CPU(ns) Iterations
> -------------------------------------------
> BM_Jsonify      68407      68415       9716
> ```
> 
> 
> Thanks,
> 
> Alexander Rojas
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message