giraph-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-810) Giraph should track aggregate statistics over lifetime of the computation
Date Wed, 08 Jan 2014 20:28:53 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865856#comment-13865856 ]

Hudson commented on GIRAPH-810:
-------------------------------

SUCCESS: Integrated in Giraph-trunk-Commit #1381 (See [https://builds.apache.org/job/Giraph-trunk-Commit/1381/])
GIRAPH-810: Giraph should track aggregate statistics over lifetime of the computation (rvesse via majakabiljo) (majakabiljo: http://git-wip-us.apache.org/repos/asf?p=giraph.git&a=commit&h=20f8df00e71a0061de820f0973f7fa7b62086afb)
* giraph-core/src/main/java/org/apache/giraph/counters/GiraphStats.java
* giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java
* CHANGELOG


> Giraph should track aggregate statistics over lifetime of the computation
> -------------------------------------------------------------------------
>
>                 Key: GIRAPH-810
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-810
>             Project: Giraph
>          Issue Type: Improvement
>    Affects Versions: 1.1.0
>            Reporter: Rob Vesse
>             Fix For: 1.1.0
>
>         Attachments: GIRAPH-810.patch
>
>
> When Giraph completes a job it reports a set of information about the job like so:
> {noformat}
> Giraph Timers
> 2013-12-04 10:43:45,570 INFO org.apache.hadoop.mapred.JobClient (main):     Superstep 3 TriangleFindingComputation (ms)=102234
> 2013-12-04 10:43:45,570 INFO org.apache.hadoop.mapred.JobClient (main):     Superstep 2 TriangleFindingComputation (ms)=29419
> 2013-12-04 10:43:45,570 INFO org.apache.hadoop.mapred.JobClient (main):     Superstep 1 TriangleFindingComputation (ms)=34397
> 2013-12-04 10:43:45,570 INFO org.apache.hadoop.mapred.JobClient (main):     Input superstep (ms)=12642
> 2013-12-04 10:43:45,570 INFO org.apache.hadoop.mapred.JobClient (main):     Total (ms)=208962
> 2013-12-04 10:43:45,570 INFO org.apache.hadoop.mapred.JobClient (main):     Superstep 0 TriangleFindingComputation (ms)=4201
> 2013-12-04 10:43:45,570 INFO org.apache.hadoop.mapred.JobClient (main):     Shutdown (ms)=2698
> 2013-12-04 10:43:45,570 INFO org.apache.hadoop.mapred.JobClient (main):     Setup (ms)=23351
> 2013-12-04 10:43:45,571 INFO org.apache.hadoop.mapred.JobClient (main):   Zookeeper server:port
> 2013-12-04 10:43:45,571 INFO org.apache.hadoop.mapred.JobClient (main):     ip-10-145-221-220.ec2.internal:22181=0
> 2013-12-04 10:43:45,571 INFO org.apache.hadoop.mapred.JobClient (main):   Giraph Stats
> 2013-12-04 10:43:45,571 INFO org.apache.hadoop.mapred.JobClient (main):     Aggregate edges=150000
> 2013-12-04 10:43:45,571 INFO org.apache.hadoop.mapred.JobClient (main):     Sent message bytes=0
> 2013-12-04 10:43:45,571 INFO org.apache.hadoop.mapred.JobClient (main):     Superstep=4
> 2013-12-04 10:43:45,571 INFO org.apache.hadoop.mapred.JobClient (main):     Last checkpointed superstep=0
> 2013-12-04 10:43:45,571 INFO org.apache.hadoop.mapred.JobClient (main):     Current workers=16
> 2013-12-04 10:43:45,571 INFO org.apache.hadoop.mapred.JobClient (main):     Current master task partition=0
> 2013-12-04 10:43:45,571 INFO org.apache.hadoop.mapred.JobClient (main):     Sent messages=0
> 2013-12-04 10:43:45,571 INFO org.apache.hadoop.mapred.JobClient (main):     Aggregate finished vertices=1000
> 2013-12-04 10:43:45,571 INFO org.apache.hadoop.mapred.JobClient (main):     Aggregate vertices=1000
> {noformat}
> The problem is that some of these statistics are not particularly helpful, since they pertain only to the most recent superstep, namely Sent messages and Sent message bytes.
> I understand there is a reason for this, since the number of sent messages is used to help determine whether a computation should halt at a given superstep, but it would be useful if these were also tracked in aggregate over the lifetime of the computation.
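
To illustrate the request quoted above, here is a minimal sketch of keeping a per-superstep value alongside a running lifetime total. It is not the code from the attached patch and does not use Giraph's GiraphStats API: the class name MessageStatsSketch, its methods, and the message and byte counts passed in main are all hypothetical, chosen only to show why the final report can read 0 for "Sent messages" while a lifetime total stays meaningful.

```java
/**
 * Hypothetical sketch, not Giraph code: tracks the message counters of the
 * most recent superstep together with totals over the whole computation.
 */
final class MessageStatsSketch {
    // Values for the most recent superstep only (overwritten each time).
    private long lastSuperstepMessages;
    private long lastSuperstepBytes;
    // Running totals over the lifetime of the computation.
    private long aggregateMessages;
    private long aggregateBytes;

    /** Records the counts reported for one completed superstep. */
    void recordSuperstep(long sentMessages, long sentBytes) {
        lastSuperstepMessages = sentMessages;
        lastSuperstepBytes = sentBytes;
        aggregateMessages += sentMessages;
        aggregateBytes += sentBytes;
    }

    long getLastSuperstepMessages() { return lastSuperstepMessages; }
    long getLastSuperstepBytes() { return lastSuperstepBytes; }
    long getAggregateMessages() { return aggregateMessages; }
    long getAggregateBytes() { return aggregateBytes; }

    public static void main(String[] args) {
        MessageStatsSketch stats = new MessageStatsSketch();
        // Illustrative numbers only; they are not taken from the job above.
        stats.recordSuperstep(500, 4000);
        stats.recordSuperstep(120, 960);
        stats.recordSuperstep(0, 0); // final superstep sends nothing

        // The per-superstep view ends at zero; the lifetime view does not.
        System.out.println("Sent messages=" + stats.getLastSuperstepMessages());
        System.out.println("Aggregate sent messages=" + stats.getAggregateMessages());
        System.out.println("Aggregate sent message bytes=" + stats.getAggregateBytes());
    }
}
```

The per-superstep fields are kept rather than replaced, since the description notes that the per-superstep message count is still needed for the halting decision.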



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
