mesos-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Benjamin Mahler <bmah...@apache.org>
Subject Re: Review Request 58355: Removed unnecessary Registry copying.
Date Mon, 17 Apr 2017 23:21:47 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58355/#review172147
-----------------------------------------------------------



Thanks for taking this on Ilya! Just a few suggestions below per our offline discussion. It
would be great to update the description to include the context, mainly that we avoid the
`protobuf::State` wrapper in favor of performing manual serialization/deserialization in the
registrar, in order to avoid expensive protobuf copying. Also that, we should explore performance
improvements to `protobuf::State` in order to make it usable in the registrar without regressing
on the performance of this approach.


src/master/registrar.cpp
Line 215 (original), 217-218 (patched)
<https://reviews.apache.org/r/58355/#comment245249>

    It would be nice to store the `State` class alongside these and put the TODO here:
    
    ```
      // TODO(<username>): We use the "untyped" `State` class here and perform
      // the protobuf (de)serialization manually within the Registrar, because
      // the use of `protobuf::State` incurs a dramatic peformance cost from
      // protobuf copying. We should explore using `protobuf::State`, which will
      // require move support and other copy elimination to maintain the
      // performance of the current approach.
      State* state;
      
      // Per the TODO above, we store both serialized and deserialized versions
      // of the `Registry` protobuf. If we're able to move to `protobuf::State`,
      // we could just store a single `protobuf::state::Variable<Registry>`.
      Option<Variable> variable;
      Option<Registry> registry;
    ```
    
    By the way, would be nice if these both use or don't use underscores consistently.



src/master/registrar.cpp
Line 215 (original), 217-218 (patched)
<https://reviews.apache.org/r/58355/#comment245251>

    



src/master/registrar.cpp
Line 334 (original), 337 (patched)
<https://reviews.apache.org/r/58355/#comment245247>

    Chatted with Ilya offline about using the raw untyped `State` class rather than using
the protobuf class and bypassing the overridden method here.
    
    We can then frame this patch as being an avoidance of the protobuf `State` wrapper due
to the extra expense incurred from protobuf copying, and leave a TODO in the code that captures
this for posterity.



src/master/registrar.cpp
Lines 415 (patched)
<https://reviews.apache.org/r/58355/#comment245248>

    



src/master/registrar.cpp
Lines 449-450 (original), 464-465 (patched)
<https://reviews.apache.org/r/58355/#comment245253>

    Should we say we use an Owned here to avoid copying, since protobuf doesn't support move
construction?


- Benjamin Mahler


On April 11, 2017, 3:33 p.m., Ilya Pronin wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58355/
> -----------------------------------------------------------
> 
> (Updated April 11, 2017, 3:33 p.m.)
> 
> 
> Review request for mesos and Benjamin Mahler.
> 
> 
> Bugs: MESOS-7376
>     https://issues.apache.org/jira/browse/MESOS-7376
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> When the number of agents is large every `Registry` copy operation takes
> a lot of time (~0.4 sec with 55k agents) because it involves deep
> copying a big object graph.
> 
> This patch minimizes `Registry` copying to keep registry update timings
> at acceptable values.
> 
> 
> Diffs
> -----
> 
>   include/mesos/state/protobuf.hpp c15696ab79b61f5487ee4a849d62b34b91ca1614 
>   src/master/registrar.cpp 0029cc77628b5bb2a7b1ff551fb42b3eaf7b4fb1 
> 
> 
> Diff: https://reviews.apache.org/r/58355/diff/1/
> 
> 
> Testing
> -------
> 
> Benchmark
> =========
> Ran `MESOS_VERBOSE=1 ./src/mesos-tests --benchmark --gtest_filter='*Registrar_BENCHMARK_Test.Performance/3'`.
> 
> Before:
> ```
> I0411 10:04:11.726016 11802 registrar.cpp:508] Successfully updated the registry in 89.478144ms
> I0411 10:04:13.860827 11803 registrar.cpp:508] Successfully updated the registry in 216.688896ms
> I0411 10:04:15.167768 11803 registrar.cpp:508] Successfully updated the registry in 1.29364096secs
> I0411 10:04:18.967394 11803 registrar.cpp:508] Successfully updated the registry in 3.696552192secs
> I0411 10:04:25.631009 11803 registrar.cpp:508] Successfully updated the registry in 6.267425024secs
> I0411 10:04:42.625507 11803 registrar.cpp:508] Successfully updated the registry in 15.876419072secs
> I0411 10:04:44.209377 11787 registrar_tests.cpp:1262] Admitted 50000 agents in 30.479743816secs
> I0411 10:05:04.446650 11820 registrar.cpp:508] Successfully updated the registry in 18.338545152secs
> I0411 10:05:21.171001 11820 registrar.cpp:508] Successfully updated the registry in 15.31903872secs
> I0411 10:05:37.592319 11820 registrar.cpp:508] Successfully updated the registry in 14.863101952secs
> I0411 10:05:39.099174 11787 registrar_tests.cpp:1276] Marked 50000 agents reachable in
53.593596102secs
> ../../src/tests/registrar_tests.cpp:1287: Failure
> Failed to wait 15secs for registry
> ```
> After:
> ```
> I0411 15:19:12.228904 40643 registrar.cpp:524] Successfully updated the registry in 91.262208ms
> I0411 15:19:14.543190 40660 registrar.cpp:524] Successfully updated the registry in 377.45408ms
> I0411 15:19:15.707006 40660 registrar.cpp:524] Successfully updated the registry in 1.138724096secs
> I0411 15:19:18.267305 40660 registrar.cpp:524] Successfully updated the registry in 2.466145792secs
> I0411 15:19:19.092073 40660 registrar.cpp:524] Successfully updated the registry in 523.11296ms
> I0411 15:19:20.809330 40648 registrar.cpp:524] Successfully updated the registry in 892.141824ms
> I0411 15:19:21.194135 40622 registrar_tests.cpp:1262] Admitted 50000 agents in 6.938952085secs
> I0411 15:19:26.973904 40637 registrar.cpp:524] Successfully updated the registry in 3.938064128secs
> I0411 15:19:28.631865 40637 registrar.cpp:524] Successfully updated the registry in 1.116326144secs
> I0411 15:19:30.222944 40660 registrar.cpp:524] Successfully updated the registry in 911.86688ms
> I0411 15:19:30.678509 40622 registrar_tests.cpp:1276] Marked 50000 agents reachable in
8.249523305secs
> I0411 15:19:35.138797 40645 registrar.cpp:524] Successfully updated the registry in 815.439104ms
> I0411 15:19:41.783651 40622 registrar_tests.cpp:1288] Recovered 50000 agents (8238915B)
in 10.963297677secs
> I0411 15:19:47.431670 40657 registrar.cpp:524] Successfully updated the registry in 3.960920064secs
> I0411 15:20:13.769872 40657 registrar.cpp:524] Successfully updated the registry in 1.169234944secs
> I0411 15:21:49.685801 40657 registrar.cpp:524] Successfully updated the registry in 264.850688ms
> Removed 50000 agents in 2.12256788111667mins
> ```
> Agents removal seems to be O(n^2) because of linear lookup, hence 2 mins.
> 
> Unit
> ====
> Ran `make check`.
> 
> 
> Thanks,
> 
> Ilya Pronin
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message