aurora-reviews mailing list archives

From Aurora ReviewBot <wfar...@apache.org>
Subject Re: Review Request 60748: Prototype using cgroups for monitoring Thermos Process resource consumption (CPU and memory)
Date Mon, 10 Jul 2017 18:36:33 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60748/#review180080
-----------------------------------------------------------



Master (a922b05) is red with this patch.
  ./build-support/jenkins/build.sh

@@ -29,7 +29,6 @@
 import subprocess
 import sys
 import time
-
 from abc import abstractmethod
 from copy import deepcopy
 
ERROR: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/thermos/monitoring/process_collector_cgroup.py
Imports are incorrectly sorted.
--- /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/thermos/monitoring/process_collector_cgroup.py:before
2017-07-10 18:31:05.912538
+++ /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/thermos/monitoring/process_collector_cgroup.py:after
2017-07-10 18:36:31.156897
@@ -14,13 +14,14 @@
 
 """ Sample resource consumption statistics for processes using psutil """
 
+import traceback
 from operator import attrgetter
 from time import time
-import traceback
 
 from twitter.common import log
 
 from apache.thermos.core.cgroup import ControlGroupHelper
+
 from .process import ProcessSample
 
 
ERROR: /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/thermos/monitoring/resource.py
Imports are incorrectly sorted.
--- /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/thermos/monitoring/resource.py:before
2017-07-10 18:31:05.912538
+++ /home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/python/apache/thermos/monitoring/resource.py:after
2017-07-10 18:36:31.171885
@@ -43,8 +43,8 @@
 
 from .disk import DiskCollector
 from .process import ProcessSample
+from .process_collector_cgroup import ProcessCollector
 from .process_collector_psutil import ProcessTreeCollector
-from .process_collector_cgroup import ProcessCollector
 
 
 class ResourceMonitorBase(Interface):
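
Both failures above have the same shape: within one block, plain `import x` lines are expected before `from x import y` lines, and each group is ordered by module name. The sketch below only illustrates that ordering rule so the two diffs are easier to read. It is not the check that `build.sh` runs, and `sort_import_block` is a hypothetical helper name.

```python
def sort_import_block(lines):
    """Order one contiguous block of import lines.

    Assumed rule (inferred from the diffs above, not taken from the build
    scripts): plain "import x" statements first, then "from x import y"
    statements, each group sorted by module name.
    """
    def module_name(line):
        # "import a.b" -> "a.b"; "from a.b import c" -> "a.b"
        return line.split()[1]

    plain = sorted((l for l in lines if l.startswith("import ")), key=module_name)
    froms = sorted((l for l in lines if l.startswith("from ")), key=module_name)
    return plain + froms


# The block flagged in process_collector_cgroup.py, as submitted.
block = [
    "from operator import attrgetter",
    "from time import time",
    "import traceback",
]
print("\n".join(sort_import_block(block)))
```

The sketch does not model the blank line the check also wants between the `apache.thermos` import and the relative `.process` import.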


I will refresh this build result if you post a review containing "@ReviewBot retry"

- Aurora ReviewBot


On July 10, 2017, 6:30 p.m., Reza Motamedi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60748/
> -----------------------------------------------------------
> 
> (Updated July 10, 2017, 6:30 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Santhosh Kumar Shanmugham, Stephan Erb, and Zameer Manji.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> # Prototype using cgroups for monitoring Thermos Process resource consumption (CPU and memory)
> The idea behind this prototype is to use kernel cgroups instead of per-PID monitoring of Thermos Tasks and Processes.
> This [document](https://docs.google.com/a/twitter.com/document/d/16JFIqY2ftvNNXxYf6jQwO6EXPajCKp7kPJRAQSsaPko/edit?usp=sharing) describes the problem that this prototype tries to solve in more detail.
> 
> __Note:__ Since I am piggybacking on the cgroup clean-up implemented in Mesos, if Mesos's memory and CPU isolation are not enabled, I will not create cgroups and will simply revert to the old monitoring scheme.
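
As a rough illustration of the cgroup-based approach described above, the sketch below reads aggregate CPU time and memory usage for a whole cgroup from two cgroup v1 files, `cpuacct.usage` (nanoseconds of CPU time) and `memory.usage_in_bytes`, instead of walking individual PIDs. It assumes a cgroup v1 layout. The function names and directory arguments are hypothetical and are not the `ControlGroupHelper` API from the patch. The demo writes fake files to a temporary directory so it runs without a real cgroup.

```python
import os
import tempfile


def read_cgroup_value(cgroup_dir, filename):
    """Return the integer stored in a single-value cgroup file, or None."""
    try:
        with open(os.path.join(cgroup_dir, filename)) as f:
            return int(f.read().strip())
    except (IOError, OSError, ValueError):
        # Missing cgroup or unreadable file: the caller can fall back
        # to per-PID monitoring.
        return None


def sample_cgroup(cpuacct_dir, memory_dir):
    """Sample cumulative CPU seconds and current memory bytes for one cgroup."""
    cpu_ns = read_cgroup_value(cpuacct_dir, "cpuacct.usage")
    mem_bytes = read_cgroup_value(memory_dir, "memory.usage_in_bytes")
    return {
        "cpu_seconds": None if cpu_ns is None else cpu_ns / 1e9,
        "memory_bytes": mem_bytes,
    }


# Demo against a fake cgroup directory (hypothetical values).
fake_dir = tempfile.mkdtemp()
with open(os.path.join(fake_dir, "cpuacct.usage"), "w") as f:
    f.write("2500000000\n")
with open(os.path.join(fake_dir, "memory.usage_in_bytes"), "w") as f:
    f.write("1048576\n")
print(sample_cgroup(fake_dir, fake_dir))
```

Returning `None` for a missing file mirrors the fallback in the note: when the cgroup is absent, the caller can keep using the old scheme.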
> 
> # Notes on Performance:
> 
> I used `top -p <thermos-pid> -bc -n 10 | grep 'python'` to monitor the CPU usage of Thermos in my Vagrant environment. I had 7 Tasks, each with 3 Processes.
> > Stock Thermos Observer
> ```
> 21641 root      20   0 1351200  44448   4088 S   6.6  1.4   0:35.69 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=NONE --log_to_stderr=google:INFO
> 21641 root      20   0 1351200  44448   4088 S   2.7  1.4   0:35.77 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=NONE --log_to_stderr=google:INFO
> 21641 root      20   0 1351200  44448   4088 S   3.3  1.4   0:35.87 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=NONE --log_to_stderr=google:INFO
> 21641 root      20   0 1351200  44448   4088 S   2.3  1.4   0:35.94 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=NONE --log_to_stderr=google:INFO
> 21641 root      20   0 1351200  44448   4088 S   4.3  1.4   0:36.07 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=NONE --log_to_stderr=google:INFO
> 21641 root      20   0 1351200  44448   4088 S   3.6  1.4   0:36.18 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=NONE --log_to_stderr=google:INFO
> 21641 root      20   0 1351204  44616   4088 S  11.6  1.4   0:36.53 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=NONE --log_to_stderr=google:INFO
> 21641 root      20   0 1351200  44552   4088 S  39.6  1.4   0:37.72 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=NONE --log_to_stderr=google:INFO
> 21641 root      20   0 1351200  44552   4088 S   2.7  1.4   0:37.80 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=NONE --log_to_stderr=google:INFO
> 21641 root      20   0 1351200  44552   4088 S   7.6  1.4   0:38.03 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=NONE --log_to_stderr=google:INFO
> ```
> > Thermos Observer using CGROUP monitoring
> ```
> 15203 root      20   0 1367828  45344   4088 S   6.6  1.5   0:55.37 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=DEBUG --log_to_stderr=google:INFO
> 15203 root      20   0 1367828  45344   4088 S   2.0  1.5   0:55.43 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=DEBUG --log_to_stderr=google:INFO
> 15203 root      20   0 1351436  45308   4088 S   4.3  1.5   0:55.56 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=DEBUG --log_to_stderr=google:INFO
> 15203 root      20   0 1351436  45308   4088 S   2.3  1.5   0:55.63 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=DEBUG --log_to_stderr=google:INFO
> 15203 root      20   0 1351436  45308   4088 S   2.0  1.5   0:55.69 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=DEBUG --log_to_stderr=google:INFO
> 15203 root      20   0 1351436  45308   4088 S   3.3  1.5   0:55.79 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=DEBUG --log_to_stderr=google:INFO
> 15203 root      20   0 1351436  45308   4088 S   2.3  1.5   0:55.86 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=DEBUG --log_to_stderr=google:INFO
> 15203 root      20   0 1351436  45308   4088 S   1.0  1.5   0:55.89 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=DEBUG --log_to_stderr=google:INFO
> 15203 root      20   0 1351436  45308   4088 S   2.3  1.5   0:55.96 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=DEBUG --log_to_stderr=google:INFO
> 15203 root      20   0 1351436  45308   4088 S   3.3  1.5   0:56.06 python2.7 /home/vagrant/aurora/dist/thermos_observer.pex --ip=192.168.33.7 --port=1338 --log_to_disk=DEBUG --log_to_stderr=google:INFO
> ```
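
To compare the two runs at a glance, the %CPU column of the `top` samples above can be summarized as below. The numbers are copied from the two listings. The variable names are hypothetical, and ten samples per run is a small basis for any conclusion.

```python
from statistics import mean, median

# %CPU column from the two top listings above, in order.
stock_cpu = [6.6, 2.7, 3.3, 2.3, 4.3, 3.6, 11.6, 39.6, 2.7, 7.6]
cgroup_cpu = [6.6, 2.0, 4.3, 2.3, 2.0, 3.3, 2.3, 1.0, 2.3, 3.3]


def summarize(samples):
    """Return mean, median and max of a list of %CPU samples."""
    return {
        "mean": round(mean(samples), 2),
        "median": round(median(samples), 2),
        "max": max(samples),
    }


print("stock :", summarize(stock_cpu))
print("cgroup:", summarize(cgroup_cpu))
```

The stock mean (about 8.4%) is pulled up by the single 39.6% sample. The medians (about 4.0% versus 2.3%) are closer together. The two runs also differ in `--log_to_disk` (`NONE` versus `DEBUG`), so they are not a strictly like-for-like comparison.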
> 
> 
> Diffs
> -----
> 
>   examples/vagrant/mesos_config/etc_mesos-slave/isolation 1a7028ffc70116b104ef3ad22b7388f637707a0f
>   src/main/python/apache/aurora/executor/thermos_task_runner.py 8f88af4c24ddc603fa12587741af56a6c711e420
>   src/main/python/apache/thermos/core/cgroup.py PRE-CREATION
>   src/main/python/apache/thermos/core/process.py 4a4678ff39c84cb87836aca19365c5b2aabc4fa4
>   src/main/python/apache/thermos/monitoring/process_collector_cgroup.py PRE-CREATION
>   src/main/python/apache/thermos/monitoring/resource.py 434666696e600a0e6c19edd986c86575539976f2
>   src/main/python/apache/thermos/observer/http/templates/task.tpl f3e06985eb3c05572aa4389d97da575b1179f616
> 
> 
> Diff: https://reviews.apache.org/r/60748/diff/1/
> 
> 
> Testing
> -------
> 
> This patch is mostly a prototype. Note that I had to enable Mesos's CPU and memory isolation.
> 
> Current tests pass. I first want to see how the community feels about this approach in general, and then I will add more tests.
> 
> 
> Thanks,
> 
> Reza Motamedi
> 
>

