aurora-reviews mailing list archives

From Aurora ReviewBot <wfar...@apache.org>
Subject Re: Review Request 61016: lock psutil's oneshot
Date Sat, 22 Jul 2017 06:24:47 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61016/#review181170
-----------------------------------------------------------



Master (8f5a591) is green with this patch.
  ./build-support/jenkins/build.sh

However, it appears that it might lack test coverage.

I will refresh this build result if you post a review containing "@ReviewBot retry"

- Aurora ReviewBot


On July 21, 2017, 9:12 p.m., Reza Motamedi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61016/
> -----------------------------------------------------------
> 
> (Updated July 21, 2017, 9:12 p.m.)
> 
> 
> Review request for Aurora, Santhosh Kumar Shanmugham and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> # lock psutil's oneshot
> 
> TL;DR: psutil's `oneshot` is not thread-safe.
> 
> After a lot of testing on busy machines, I realized that psutil's oneshot is not thread-safe. I contacted the developer but have not yet received a concrete fix.
> 
> Please read https://issues.apache.org/jira/browse/AURORA-1939 and https://github.com/giampaolo/psutil/issues/1110 for more information.
> 
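As an illustration of what "locking oneshot" could look like, here is a minimal sketch, not the patch itself. It assumes a single module-level lock is acceptable; `_ONESHOT_LOCK`, `locked_oneshot`, and `sample` are hypothetical names chosen for this example. The only psutil calls it relies on are `Process.oneshot()`, `Process.cpu_times()`, and `Process.memory_info()`.

```python
import threading
from contextlib import contextmanager

# Hypothetical module-level lock; the name is illustrative and not taken from the patch.
_ONESHOT_LOCK = threading.Lock()


@contextmanager
def locked_oneshot(process):
    """Enter process.oneshot() while holding a global lock.

    `process` is expected to behave like psutil.Process, i.e. to expose
    an oneshot() context manager.
    """
    with _ONESHOT_LOCK:
        with process.oneshot():
            yield process


def sample(process):
    """Read CPU and memory figures for one process under the lock."""
    with locked_oneshot(process):
        cpu_times = process.cpu_times()
        memory_info = process.memory_info()
    return cpu_times.user, cpu_times.system, memory_info.rss
```

With psutil installed, `sample(psutil.Process(pid))` could be called from several collector threads. The lock serializes every oneshot section, which trades some concurrency for consistent readings.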
> 
> Diffs
> -----
> 
>   src/main/python/apache/thermos/monitoring/process_collector_psutil.py 3594955c68b45ab65c01426ba0a18ec8a132a27f

> 
> 
> Diff: https://reviews.apache.org/r/61016/diff/1/
> 
> 
> Testing
> -------
> 
> The following test was run by adding extra logging to the current code:
> 
> 
> ```
> ... 
>      cpu_times = process.cpu_times()
> +    log.debug("process:{} cpu times {}".format(process, cpu_times))
>      user, system = cpu_times.user, cpu_times.system
>      memory_info = p
> ...      
> ```
> 
> ```
> $ grep '36350' thermos-observer.XXXX.prod.twttr.net.root.log.DEBUG.20170721-163950.9421
> D0721 16:55:28.242974 9421 process_collector_psutil.py:40] process:psutil.Process(pid=36350, name='mesos-slave') cpu times pcputimes(user=2500.95, system=4487.06, children_user=0.0, children_system=0.0)
> D0721 17:11:21.940462 9421 process_collector_psutil.py:40] process:psutil.Process(pid=36350, name='bash') cpu times pcputimes(user=0.0, system=0.03, children_user=0.0, children_system=0.0)
> D0721 17:11:22.247414 9421 process_collector_psutil.py:111] Calculated rate for pid=34339 and children: -7.32560348996 (old: 6988.040000, new: 0.060000) {34339: 1498166704.32, 36350: 1498166720.51} -> {34339: 1498166704.32, 36350: 1498166720.51} [{34339: ProcessSample(rate=0.0, user=0.0, system=0.03, rss=2777088, vms=11919360, nice=0, status='sleeping', threads=1), 36350: ProcessSample(rate=0.0, user=2500.95, system=4487.06, rss=41906176, vms=1601019904, nice=0, status='sleeping', threads=20)}] [{34339: ProcessSample(rate=0.0, user=0.0, system=0.03, rss=2777088, vms=11919360, nice=0, status='sleeping', threads=1), 36350: ProcessSample(rate=0.0, user=0.0, system=0.03, rss=41906176, vms=1601019904, nice=0, status='sleeping', threads=20)}]
> ```
> 
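> In the log above, pid 36350 is first reported as `mesos-slave` and later as `bash` with near-zero CPU times, which produces a negative calculated rate. These inconsistencies disappear after removing oneshot.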
> 
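To see why the logged rate is negative, the arithmetic can be reproduced from the quoted log. This is a rough sketch that assumes the rate is the change in summed user and system CPU seconds divided by elapsed wall-clock time, and that the gap between the two log timestamps approximates the sampling interval. `cpu_rate` is a hypothetical helper, not the collector's actual function.

```python
from datetime import datetime


def cpu_rate(old_total, new_total, elapsed_seconds):
    """Change in cumulative CPU seconds per second of wall-clock time."""
    return (new_total - old_total) / elapsed_seconds


# (user, system) CPU seconds per pid, copied from the ProcessSample entries in the log.
old_samples = {34339: (0.0, 0.03), 36350: (2500.95, 4487.06)}
new_samples = {34339: (0.0, 0.03), 36350: (0.0, 0.03)}

old_total = sum(u + s for u, s in old_samples.values())  # logged as "old: 6988.040000"
new_total = sum(u + s for u, s in new_samples.values())  # logged as "new: 0.060000"

# Assumption: the gap between the two quoted log lines approximates the sampling interval.
fmt = "%H:%M:%S.%f"
elapsed = (datetime.strptime("17:11:22.247414", fmt)
           - datetime.strptime("16:55:28.242974", fmt)).total_seconds()

rate = cpu_rate(old_total, new_total, elapsed)
print("old={:.2f} new={:.2f} elapsed={:.1f}s rate={:.4f}".format(
    old_total, new_total, elapsed, rate))
```

Cumulative CPU time for a live process should never decrease, so a negative rate indicates that the second reading for pid 36350 did not belong to that process. Under these assumptions the result lands close to the logged rate of about -7.3.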
> 
> Thanks,
> 
> Reza Motamedi
> 
>

