hadoop-common-dev mailing list archives

From "Alejandro Abdelnur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3748) Flag to make tasks to send counter information only at the end of the task
Date Wed, 27 Aug 2008 16:19:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626171#action_12626171 ]

Alejandro Abdelnur commented on HADOOP-3748:
--------------------------------------------

We've run some tests with jobs running for 30 minutes and using all task slots in the cluster
(8 tasks per slave). We did runs with 0 counters and runs with 200 counters (the same
jobs, just enabling the counters via a configuration property).

8-node cluster: no noticeable increase in avg network pgk_in traffic on the JT box when using
counters.
100-node cluster: 28% increase in avg network pgk_in traffic on the JT box when using counters.
200-node cluster: 60% increase in avg network pgk_in traffic on the JT box when using counters
(with 10x peaks).

As we suspected, the greater the number of nodes, the higher the impact of using a large number
of counters. Thus the proposed approach.
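
A rough back-of-the-envelope sketch of why the load grows with cluster size. It takes the
test setup above (8 task slots per slave, 200 counters) and assumes, purely for illustration,
that every running task sends all of its counters in each update round. That model is an
assumption, not something measured in these runs, and the function name is hypothetical.

```python
# Hypothetical model: every running task sends all of its counters in each
# update round. The slot and counter figures come from the test setup
# described above; the reporting model itself is an assumption.
TASKS_PER_SLAVE = 8
COUNTERS_PER_TASK = 200


def counter_values_per_round(nodes,
                             tasks_per_slave=TASKS_PER_SLAVE,
                             counters_per_task=COUNTERS_PER_TASK):
    """Counter values arriving at the JobTracker in one update round."""
    return nodes * tasks_per_slave * counters_per_task


if __name__ == "__main__":
    for nodes in (8, 100, 200):
        print(nodes, counter_values_per_round(nodes))
    # prints:
    # 8 12800
    # 100 160000
    # 200 320000
```

Under this model the volume grows linearly with the number of nodes. The measured traffic
increase need not follow the same curve, since it also depends on everything else the
JobTracker receives.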

We want to leverage Hadoop counters because Hadoop takes care of aggregating and reporting them,
and in the case of task failures they are kept consistent. We use them to track what
happens to records during processing.




> Flag to make tasks to send counter information only at the end of the task
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-3748
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3748
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>         Environment: all
>            Reporter: Alejandro Abdelnur
>
> Currently counters are streamed from the task to the JobTracker as the task progresses.
If the number of counters is large, this has a significant impact on the network traffic as
well as on the JobTracker load.
> There should be a flag, for example by counter-group, that indicates that the counters
are to be reported at the end of the task. By default this flag should be set to false for
all counter-groups, maintaining the current behavior.
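
The proposed behavior can be sketched as a small simulation. This is not Hadoop code and
does not use Hadoop's API: the class `TaskCounters`, its methods, and the `deferred_groups`
parameter are hypothetical names chosen for illustration. Groups with the flag set are held
back from in-progress reports and only appear in the final report; with no groups flagged,
every counter is reported as the task progresses, matching the current behavior.

```python
from collections import defaultdict


class TaskCounters:
    """Hypothetical task-side counter store (illustration only, not Hadoop's API)."""

    def __init__(self, deferred_groups=()):
        # Counter groups flagged to be reported only at the end of the task.
        # Empty by default, so every group is reported while the task runs.
        self.deferred_groups = set(deferred_groups)
        self.values = defaultdict(int)

    def increment(self, group, name, amount=1):
        self.values[(group, name)] += amount

    def progress_report(self):
        """Counters sent while the task is still running."""
        return {key: value for key, value in self.values.items()
                if key[0] not in self.deferred_groups}

    def final_report(self):
        """Counters sent once, when the task finishes: all groups."""
        return dict(self.values)


if __name__ == "__main__":
    counters = TaskCounters(deferred_groups={"records"})
    counters.increment("records", "skipped", 3)
    counters.increment("framework", "bytes_read", 10)
    print(counters.progress_report())  # only the "framework" group
    print(counters.final_report())     # both groups
```

Keeping the flag per group rather than per job lets a small set of counters keep streaming
while large application-defined groups wait until the end of the task.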

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

