aurora-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stephan Erb <s...@apache.org>
Subject Re: Review Request 60376: Observer task page to load consumption info from history
Date Tue, 27 Jun 2017 21:10:15 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60376/#review179043
-----------------------------------------------------------



Let's see how this works out in practice! I am really looking forward to snappy thermos pages
again.


src/main/python/apache/thermos/monitoring/resource.py
Line 54 (original), 53 (patched)
<https://reviews.apache.org/r/60376/#comment253446>

    It might be worth to rename this to something like 'AggregateResourceResult' or 'TaskResourceResult'.



src/main/python/apache/thermos/monitoring/resource.py
Line 160 (original), 166 (patched)
<https://reviews.apache.org/r/60376/#comment253448>

    The usages of `ts_value_pair` has made it difficult to understand what is going on. I
believe it becomes a little bit clearer to the casual reader if you do something like this:
    
        timestamp, resources = self._history.get(timestamp)
        
    This will replace usages of `ts_value_pair[1].proc_usage.values()` with `resources.proc_usage.values()`
which makes it slightly clearer.
    
    Calling this `full_resources` instead of `resources` should work as well.



src/main/python/apache/thermos/monitoring/resource.py
Lines 190 (patched)
<https://reviews.apache.org/r/60376/#comment253449>

    in this case you coud do
    
        _, resources = self._history.get(time.time())
        
    
    as you don't care about the timestamp.



src/main/python/apache/thermos/monitoring/resource.py
Line 222 (original), 243-244 (patched)
<https://reviews.apache.org/r/60376/#comment253450>

    We normally don't do hanging indentation https://github.com/twitter/commons/blob/master/src/python/twitter/common/styleguide.md#best-practices



src/test/python/apache/thermos/monitoring/test_resource.py
Lines 84-103 (patched)
<https://reviews.apache.org/r/60376/#comment253451>

    I am not a big fan of the mocking here, even though the sibling testcases are not much
better :). Mocking results in high coupling, making the implementation harder to change in
the future.
    
    Please give it a try if you can manage to inject a normal, pre-populated `ResourceHistory`
into the constructor of the `TaskResourceMonitor` instead.
    
    (If you don't have any conclusive results after 10-15 minutes feel free to give up. There
might be some further interdependency that I am not spotting right now)


- Stephan Erb


On June 22, 2017, 10:18 p.m., Reza Motamedi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60376/
> -----------------------------------------------------------
> 
> (Updated June 22, 2017, 10:18 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, and Santhosh Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> # Observer task page to load consumption info from history
> 
> Resource consumptions of Thermos Processes are periodically calculated by TaskResourceMonitor
threads (one thread per Thermos task). This information is used to display a (semi) fresh
state of the tasks running on a host in the Observer host page, aka landing page. An aggregate
history of the consumptions is kept at the task level, although TaskResourceMonitor needs
to first collect the resource at the Process level and then aggregate them.
> 
> On the other hand, when an Observer _task page_ is visited, the resources consumption
of Thermos Processes within that task are calculated again and displayed without being aggregated.
This can become very slow since time to complete resource calculation is affected by the load
on the host.
> 
> By applying this patch we take advantage of the periodic work and fulfill information
resource requested in Observer task page from already collected resource consumptions.
> 
> 
> Diffs
> -----
> 
>   src/main/python/apache/thermos/monitoring/resource.py 434666696e600a0e6c19edd986c86575539976f2

>   src/test/python/apache/thermos/monitoring/test_resource.py d794a998f1d9fc52ba260cd31ac444aee7f8ed28

> 
> 
> Diff: https://reviews.apache.org/r/60376/diff/1/
> 
> 
> Testing
> -------
> 
> I stress tested this patch on a host that had a slow Observer page. Interestingly, I
did not need to do much to make the Observer slow. There are a few points to be made clear
first.
> - We at Twitter limit the resources allocated to the Observer using `systemd`. The observer
is allowed to use only 20% of a CPU core. The attached screen shots are from such a setup.
> - Having assigned 20% of a cpu core to Observer, starting only 8 `task`s, each with 3
`process`es is enough to make the Observer slow; 11secs to load `task page`.
> 
> 
> File Attachments
> ----------------
> 
> without the patch -- Screen Shot 2017-06-22 at 1.11.12 PM.png
>   https://reviews.apache.org/media/uploaded/files/2017/06/22/03968028-a2f5-4a99-ba57-b7a41c471436__without_the_patch_--_Screen_Shot_2017-06-22_at_1.11.12_PM.png
> with the patch -- Screen Shot 2017-06-22 at 1.07.41 PM.png
>   https://reviews.apache.org/media/uploaded/files/2017/06/22/5962c018-27d3-4463-a277-f6ad48b7f2d7__with_the_patch_--_Screen_Shot_2017-06-22_at_1.07.41_PM.png
> 
> 
> Thanks,
> 
> Reza Motamedi
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message