mesos-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dominic Hamon" <dha...@twopensource.com>
Subject Re: Review Request 18718: Added simple metrics API.
Date Tue, 25 Mar 2014 17:45:29 GMT


> On March 24, 2014, 6:59 p.m., Ben Mahler wrote:
> > Looking pretty good, can you do a style audit on this patch? I see members named
with "_" suffixes, "// = delete" comments, unnamed namespaces, etc.

done


> On March 24, 2014, 6:59 p.m., Ben Mahler wrote:
> > 3rdparty/libprocess/include/process/metrics/counter.hpp, line 19
> > <https://reviews.apache.org/r/18718/diff/11/?file=534291#file534291line19>
> >
> >     Let's kill Counter::set for now, what is the use-case for directly setting a
Counter?

slave statistics include recoveryErrors which is best captured by a counter that is directly
set.


> On March 24, 2014, 6:59 p.m., Ben Mahler wrote:
> > 3rdparty/libprocess/include/process/metrics/gauge.hpp, line 24
> > <https://reviews.apache.org/r/18718/diff/11/?file=534292#file534292line24>
> >
> >     We really should not be blocking implicitly when we call Gauge::get.
> >     
> >     I see two paths forward here:
> >     
> >     1. Metric::get returns a Future<double> and in any code you write that
reads metrics, it will need to collect() the values from the metrics.
> >     
> >     2. Alternatively, split the logic for fetching data and reading fetched data.
Ignore naming for a second, something like 'Future<double> Metric::fetch' for fetching
and caching the data asynchronously, and 'double Metric::fetched' for reading the cached fetched
data synchronously.
> >     
> >     I only mention 2 because it may be a bit too cumbersome to do reading if we
go with 1, because you will need to build up a bit of structure (mapping of context,name ->
Future<double>) in your code to pass to the continuation function so that it may read
the data easily.
> >     
> >     With 2 however, you could simply collect all the fetch() calls and have the
continuation read through the fetched() data.
> >     
> >     But, I don't think the code would be too complex for 1 so I would recommend
going that route so that we can keep the API as clean as possible here.
> 
> Ben Mahler wrote:
>     Let me give a little more motivation as to why we don't block like this, since the
pitfalls are not spelled out clearly anywhere.
>     
>     When you block a 'Process', the corresponding libprocess worker thread will block.
>     There are a limited number of worker threads, say N, where N ~ number of logical
cores on the machine.
>     Now, if you have M > N Processes running, and N of these end up in a blocked state,
then libprocess is deadlocked.
>     
>     In the case of Gauge, this is especially dangerous since the blocking is implicitly
done so it's not clear to the caller that the call to get() is a blocking operation.

Agreed - I'll change to a Future<double> return and will fix up the collection points.


- Dominic


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18718/#review38400
-----------------------------------------------------------


On March 24, 2014, 12:51 p.m., Dominic Hamon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18718/
> -----------------------------------------------------------
> 
> (Updated March 24, 2014, 12:51 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman and Ben Mahler.
> 
> 
> Bugs: MESOS-1036
>     https://issues.apache.org/jira/browse/MESOS-1036
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> See summary
> 
> 
> Diffs
> -----
> 
>   3rdparty/libprocess/Makefile.am 3c6219eb6e76306463b3710ab7e50ec8b75d3d76 
>   3rdparty/libprocess/include/process/metrics/counter.hpp PRE-CREATION 
>   3rdparty/libprocess/include/process/metrics/gauge.hpp PRE-CREATION 
>   3rdparty/libprocess/include/process/metrics/metric.hpp PRE-CREATION 
>   3rdparty/libprocess/src/metrics/metric.cpp PRE-CREATION 
>   3rdparty/libprocess/src/metrics/metrics.hpp PRE-CREATION 
>   3rdparty/libprocess/src/metrics/metrics.cpp PRE-CREATION 
>   3rdparty/libprocess/src/process.cpp 6c6acc03ad78779e8f25b1a4584069fd377f647b 
>   3rdparty/libprocess/src/tests/metrics_tests.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/18718/diff/
> 
> 
> Testing
> -------
> 
> Added unit tests. make check.
> 
> 
> Thanks,
> 
> Dominic Hamon
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message