mesos-reviews mailing list archives

From Jiang Yan Xu <...@jxu.me>
Subject Re: Review Request 63174: Added a benchmark for agent reregistration during master failover.
Date Fri, 03 Nov 2017 18:10:41 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63174/
-----------------------------------------------------------

(Updated Nov. 3, 2017, 11:10 a.m.)


Review request for mesos, Benjamin Mahler, Dmitry Zhuk, and Ilya Pronin.


Changes
-------

Addressed comment. NNFR.


Bugs: MESOS-8098
    https://issues.apache.org/jira/browse/MESOS-8098


Repository: mesos


Description
-------

The current benchmark is very simple: it runs without framework involvement and without agent retries.
It is possible to add a number of other benchmarks, so I am creating a new file for them.


Diffs (updated)
-----

  src/Makefile.am 1c97b1fd8151f87c4e9e6d62884b0ef7d582c312 
  src/tests/CMakeLists.txt 386e0473c93d0a993248c7818067071d0c761c76 
  src/tests/master_benchmarks.cpp PRE-CREATION 


Diff: https://reviews.apache.org/r/63174/diff/4/

Changes: https://reviews.apache.org/r/63174/diff/3-4/


Testing
-------

Benchmark based off https://github.com/apache/mesos/commit/41193181d6b75eeecae2729bf98007d9318e351a
(close to current HEAD).

```
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
Starting reregistration for all agents
Reregistered 2000 agents with a total of 100000 running tasks and 100000 completed tasks in 11.188008209secs
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0 (22404 ms)
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
Starting reregistration for all agents
Reregistered 2000 agents with a total of 200000 running tasks and 0 completed tasks in 20.868372615secs
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1 (37981 ms)
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
Starting reregistration for all agents
Reregistered 20000 agents with a total of 100000 running tasks and 0 completed tasks in 15.354579251secs
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2 (33766 ms)
[----------] 3 tests from AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (94151 ms total)


[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
Starting reregistration for all agents
Reregistered 2000 agents with a total of 100000 running tasks and 100000 completed tasks in 11.045441129secs
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0 (19959 ms)
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
Starting reregistration for all agents
Reregistered 2000 agents with a total of 200000 running tasks and 0 completed tasks in 21.324309077secs
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1 (38490 ms)
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
Starting reregistration for all agents
Reregistered 20000 agents with a total of 100000 running tasks and 0 completed tasks in 14.68607521secs
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2 (32073 ms)
[----------] 3 tests from AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (90523 ms total)

```

Benchmark based off https://github.com/apache/mesos/commit/d9c90bf1d9c8b3a7dcc47be0cb773efff57cfb9d
(before https://issues.apache.org/jira/browse/MESOS-7713 was merged)

```
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
Starting reregistration for all agents
Reregistered 2000 agents with a total of 100000 running tasks and 100000 completed tasks in 23.217901878secs
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0 (38327 ms)
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
Starting reregistration for all agents
Reregistered 2000 agents with a total of 200000 running tasks and 0 completed tasks in 46.158610597secs
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1 (75280 ms)
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
Starting reregistration for all agents
Reregistered 20000 agents with a total of 100000 running tasks and 0 completed tasks in 38.56781112secs
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2 (68006 ms)
[----------] 3 tests from AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (181613 ms total)

[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0
Starting reregistration for all agents
Reregistered 2000 agents with a total of 100000 running tasks and 100000 completed tasks in 25.752844224secs
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/0 (43509 ms)
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1
Starting reregistration for all agents
Reregistered 2000 agents with a total of 200000 running tasks and 0 completed tasks in 45.190859035secs
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/1 (73966 ms)
[ RUN      ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2
Starting reregistration for all agents
Reregistered 20000 agents with a total of 100000 running tasks and 0 completed tasks in 36.322992753secs
[       OK ] AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test.AgentReregistrationDelay/2 (66946 ms)
[----------] 3 tests from AgentFrameworkTaskCount/MasterFailover_BENCHMARK_Test (184421 ms total)
```

The recent patches cut the reregistration time by over 50%. All runs were built with
`--enable-optimize --enable-lock-free-run-queue --enable-lock-free-event-queue --enable-last-in-first-out-fixed-size-semaphore`.
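
For anyone who wants to check the "over 50%" figure, here is a minimal sketch of the arithmetic. It averages the two runs of each parameterized case and compares the builds. The timings are copied from the logs above; the names `head`, `before`, and `reduction` are illustrative only and not part of the patch.

```python
from statistics import mean

# Reregistration times in seconds, copied from the two runs per build above.
# Keys are the gtest parameter indices (/0, /1, /2).
head = {
    0: [11.188008209, 11.045441129],
    1: [20.868372615, 21.324309077],
    2: [15.354579251, 14.68607521],
}
before = {  # build from before MESOS-7713 was merged
    0: [23.217901878, 25.752844224],
    1: [46.158610597, 45.190859035],
    2: [38.56781112, 36.322992753],
}


def reduction(old, new):
    """Fractional reduction of the mean time going from old to new."""
    return 1.0 - mean(new) / mean(old)


reductions = {case: reduction(before[case], head[case]) for case in head}

for case, r in sorted(reductions.items()):
    print(f"AgentReregistrationDelay/{case}: {r:.1%} less time")
```

With only two runs per case this is a rough comparison, not a statistically rigorous one, but each case comes out above a 50% reduction.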


Thanks,

Jiang Yan Xu

