giraph-dev mailing list archives

From "Nitay Joffe" <ni...@apache.org>
Subject Re: Review Request: GIRAPH-421: Aggregate metrics up to master
Date Fri, 16 Nov 2012 20:18:34 GMT


> On Nov. 16, 2012, 7:41 p.m., Alessandro Presta wrote:
> > Nice!
> > Regarding the waiting time issue, any idea why workers currently send stats before waiting?

Yeah, it's because I made the stats part of the finishedSuperstep() data we write to ZK.
That's the question I was asking: should I just do another write to ZK after the wait()?
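
For illustration, here is a minimal sketch of the "write again after the wait" idea. It is not Giraph code: the in-memory map stands in for ZK, the latch stands in for the superstep barrier, and every name in it (WaitThenRewriteSketch, FAKE_ZK, finishSuperstep and its signature) is hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a worker finishing a superstep; not Giraph code. */
final class WaitThenRewriteSketch {
  /** In-memory stand-in for ZK nodes: path -> serialized stats (assumption). */
  static final Map<String, String> FAKE_ZK = new ConcurrentHashMap<>();

  /** Writes stats, waits on the barrier, then overrides the node with the wait time. */
  static long finishSuperstep(String workerPath, CountDownLatch barrier) {
    // First write: happens before the wait, so the waiting time is not known yet.
    FAKE_ZK.put(workerPath, "waitMicros=unknown");
    barrier.countDown();

    long start = System.nanoTime();
    try {
      barrier.await(); // wait for the other workers
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
    long waitMicros = TimeUnit.NANOSECONDS.toMicros(System.nanoTime() - start);

    // Second write: override the same node now that the waiting time is known.
    FAKE_ZK.put(workerPath, "waitMicros=" + waitMicros);
    return waitMicros;
  }

  public static void main(String[] args) throws InterruptedException {
    CountDownLatch barrier = new CountDownLatch(2);
    Thread slow = new Thread(() -> {
      try {
        Thread.sleep(20); // simulate a worker that finishes its compute later
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      finishSuperstep("/worker-slow", barrier);
    });
    slow.start();
    long waited = finishSuperstep("/worker-fast", barrier);
    slow.join();
    System.out.println("fast worker waited (us): " + waited);
    System.out.println(FAKE_ZK);
  }
}
```

One thing the sketch glosses over: whoever reads the node would have to do so after the second write, otherwise it still sees the stats without the waiting time.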


- Nitay


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8042/#review13524
-----------------------------------------------------------


On Nov. 13, 2012, 10:02 p.m., Nitay Joffe wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8042/
> -----------------------------------------------------------
> 
> (Updated Nov. 13, 2012, 10:02 p.m.)
> 
> 
> Review request for giraph.
> 
> 
> Description
> -------
> 
> https://issues.apache.org/jira/browse/GIRAPH-421
> 
> The workers send their metrics to the master by writing them to the ZK node that is already used for FinishedSuperstepStats.
> The master reads each worker's metrics, aggregates, and prints an overall summary.
> 
> Worker:
> https://gist.github.com/23edefe0c5bbbfd25f93
> Prints a summary at the end of each superstep. At the end of the job it does a raw dump of all of the metrics.
> 
> Master:
> https://gist.github.com/4379c15efa8173d96163
> At the end of each superstep it gathers the metrics and computes mean, min, and max. Anything else we should compute?
> There is also support for the master to track and print its own metrics, but this is not really used right now.
> There is a known "bug/feature" that aggregating the waiting time doesn't work. This is because waiting for other workers happens _after_ a worker writes its own stats. The easiest solution is for me to just write the metrics to ZK (using a different node or overriding the existing one) again after the wait is done. I haven't done this yet because I want to hear you guys' thoughts - perhaps there's a better way to do this.
> 
> As a part of this diff I also cleaned some things up, namely:
> - There is now just one option, giraph.metrics.enable, which toggles everything. When false (default), no work is done. When enabled, every worker and the master track metrics. They are aggregated and printed as in the examples above.
> - Changed metrics that are often small values like "time to first message" and "waiting time" to be in microseconds instead of milliseconds.
> 
> 
> Diffs
> -----
> 
>   /trunk/giraph/src/main/java/org/apache/giraph/GiraphConfiguration.java 1408489 
>   /trunk/giraph/src/main/java/org/apache/giraph/graph/BspService.java 1408489 
>   /trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceMaster.java 1408489 
>   /trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceWorker.java 1408489 
>   /trunk/giraph/src/main/java/org/apache/giraph/graph/GraphMapper.java 1408489 
>   /trunk/giraph/src/main/java/org/apache/giraph/metrics/AggregatedMetric.java PRE-CREATION
>   /trunk/giraph/src/main/java/org/apache/giraph/metrics/AggregatedMetrics.java PRE-CREATION
>   /trunk/giraph/src/main/java/org/apache/giraph/metrics/GiraphMetrics.java 1408489
>   /trunk/giraph/src/main/java/org/apache/giraph/metrics/GiraphMetricsRegistry.java 1408489
>   /trunk/giraph/src/main/java/org/apache/giraph/metrics/MetricsRegistryDebugger.java PRE-CREATION
>   /trunk/giraph/src/main/java/org/apache/giraph/metrics/SuperstepMetricsRegistry.java 1408489
>   /trunk/giraph/src/main/java/org/apache/giraph/metrics/ValueWithHostname.java PRE-CREATION
>   /trunk/giraph/src/main/java/org/apache/giraph/metrics/WorkerSuperstepMetrics.java PRE-CREATION
>   /trunk/giraph/src/main/java/org/apache/giraph/utils/FakeTime.java 1408489 
>   /trunk/giraph/src/main/java/org/apache/giraph/utils/SystemTime.java 1408489 
>   /trunk/giraph/src/main/java/org/apache/giraph/utils/Time.java 1408489 
>   /trunk/giraph/src/main/java/org/apache/giraph/utils/Times.java 1408489 
>   /trunk/giraph/src/main/java/org/apache/giraph/utils/WritableUtils.java 1408489 
> 
> Diff: https://reviews.apache.org/r/8042/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Nitay Joffe
> 
>
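
To make the master-side step in the quoted description concrete, here is a rough sketch of computing mean, min, and max over one metric reported by each worker. The class and method names (AggregateSketch, Summary, aggregate) are hypothetical and are not taken from the patch; keeping the host name next to the min and max values is also an assumption about what a summary might show.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical master-side aggregation of one metric across workers. */
final class AggregateSketch {
  /** Result holder: mean plus min/max and the hosts that reported them. */
  static final class Summary {
    final double mean;
    final long min;
    final String minHost;
    final long max;
    final String maxHost;

    Summary(double mean, long min, String minHost, long max, String maxHost) {
      this.mean = mean;
      this.min = min;
      this.minHost = minHost;
      this.max = max;
      this.maxHost = maxHost;
    }

    @Override
    public String toString() {
      return String.format("mean=%.1f min=%d (%s) max=%d (%s)",
          mean, min, minHost, max, maxHost);
    }
  }

  /** Aggregates per-host values; the map key is the worker's host name. */
  static Summary aggregate(Map<String, Long> valueByHost) {
    if (valueByHost.isEmpty()) {
      throw new IllegalArgumentException("no worker values to aggregate");
    }
    long sum = 0;
    long min = Long.MAX_VALUE;
    long max = Long.MIN_VALUE;
    String minHost = null;
    String maxHost = null;
    for (Map.Entry<String, Long> e : valueByHost.entrySet()) {
      long v = e.getValue();
      sum += v;
      if (minHost == null || v < min) {
        min = v;
        minHost = e.getKey();
      }
      if (maxHost == null || v > max) {
        max = v;
        maxHost = e.getKey();
      }
    }
    return new Summary((double) sum / valueByHost.size(), min, minHost, max, maxHost);
  }

  public static void main(String[] args) {
    // Made-up waiting times in microseconds, one entry per worker host.
    Map<String, Long> waitMicros = new LinkedHashMap<>();
    waitMicros.put("host-a", 1200L);
    waitMicros.put("host-b", 300L);
    waitMicros.put("host-c", 900L);
    System.out.println("waiting time: " + aggregate(waitMicros));
  }
}
```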

