hadoop-common-dev mailing list archives

From "Sharad Agarwal (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5931) Collect information about number of tasks succeeded / total per time unit for a tasktracker.
Date Thu, 04 Jun 2009 10:27:07 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716222#action_12716222 ]

Sharad Agarwal commented on HADOOP-5931:
----------------------------------------

To collect stats for the last hour/day, we can keep a moving window for each time period. A
moving window consists of multiple time slots, and the slot size decides the granularity at
which the window moves/updates. The slot size can differ between windows. For example, the
hour window could have a 5-minute update granularity and the day window a 1-hour one. In that
case the hour window would hold stats in 12 slots of 5 minutes each, and the day window would
hold stats in 24 slots of 1 hour each.

Once the end time of the last slot is crossed, a new slot is added and the oldest one is
dropped, which moves the window forward by one slot.
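
The slot mechanics above could be sketched roughly as follows. This is only an illustration of the idea, not existing Hadoop code: the class name MovingWindow, its methods, and the choice to pass the current time in as a parameter are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of a slotted moving window; not part of Hadoop. */
class MovingWindow {
    /** Task counts collected during one time slot. */
    private static final class Slot {
        final long startMillis;
        int total;
        int succeeded;

        Slot(long startMillis) {
            this.startMillis = startMillis;
        }
    }

    private final long slotMillis;
    private final int maxSlots;
    private final Deque<Slot> slots = new ArrayDeque<>();

    MovingWindow(long slotMillis, int maxSlots) {
        this.slotMillis = slotMillis;
        this.maxSlots = maxSlots;
    }

    /** Appends slots until the newest one covers nowMillis, dropping the oldest when full. */
    private void advance(long nowMillis) {
        if (slots.isEmpty()) {
            // Align the first slot to a slot boundary.
            slots.addLast(new Slot(nowMillis - nowMillis % slotMillis));
            return;
        }
        while (nowMillis >= slots.peekLast().startMillis + slotMillis) {
            slots.addLast(new Slot(slots.peekLast().startMillis + slotMillis));
            if (slots.size() > maxSlots) {
                slots.removeFirst(); // the window moves forward by one slot
            }
        }
    }

    /** Records one finished task in the slot that covers nowMillis. */
    void record(long nowMillis, boolean success) {
        advance(nowMillis);
        Slot current = slots.peekLast();
        current.total++;
        if (success) {
            current.succeeded++;
        }
    }

    /** Total tasks seen in the window as of nowMillis. */
    int total(long nowMillis) {
        advance(nowMillis);
        int sum = 0;
        for (Slot s : slots) {
            sum += s.total;
        }
        return sum;
    }

    /** Succeeded tasks seen in the window as of nowMillis. */
    int succeeded(long nowMillis) {
        advance(nowMillis);
        int sum = 0;
        for (Slot s : slots) {
            sum += s.succeeded;
        }
        return sum;
    }
}
```

With this shape, the hour window from the example would be new MovingWindow(5 * 60 * 1000L, 12) and the day window new MovingWindow(60 * 60 * 1000L, 24). Passing the time in explicitly is only a convenience for testing; a real implementation might read the clock itself.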

A simple strategy could be to collect this information in TaskTracker and report that to JobTracker
via TaskTrackerStatus. A subclass could be added to TaskTrackerStatus with fields, say:
tasksSinceStarted, tasksSucceededSinceStarted,
tasksInLastHour, tasksSucceededInLastHour,
tasksInLastDay, tasksSucceededInLastDay
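
As a rough illustration, the six counters could be carried in a small value class like the one below. It is a standalone sketch: the class name TaskStatsSnapshot, the field names, and the success-ratio helpers are hypothetical, and it deliberately does not extend the real TaskTrackerStatus or deal with serialization.

```java
/** Hypothetical holder for the proposed counters; not the real TaskTrackerStatus API. */
class TaskStatsSnapshot {
    final int tasksSinceStart;
    final int succeededSinceStart;
    final int tasksLastHour;
    final int succeededLastHour;
    final int tasksLastDay;
    final int succeededLastDay;

    TaskStatsSnapshot(int tasksSinceStart, int succeededSinceStart,
                      int tasksLastHour, int succeededLastHour,
                      int tasksLastDay, int succeededLastDay) {
        this.tasksSinceStart = tasksSinceStart;
        this.succeededSinceStart = succeededSinceStart;
        this.tasksLastHour = tasksLastHour;
        this.succeededLastHour = succeededLastHour;
        this.tasksLastDay = tasksLastDay;
        this.succeededLastDay = succeededLastDay;
    }

    /** Fraction of tasks that succeeded; 0.0 when no task ran (an arbitrary choice). */
    private static double ratio(int succeeded, int total) {
        return total == 0 ? 0.0 : (double) succeeded / total;
    }

    double successRatioSinceStart() {
        return ratio(succeededSinceStart, tasksSinceStart);
    }

    double successRatioLastHour() {
        return ratio(succeededLastHour, tasksLastHour);
    }

    double successRatioLastDay() {
        return ratio(succeededLastDay, tasksLastDay);
    }
}
```

The ratio helpers are not part of the proposal; they are only there to show how the succeeded/total pairs might be consumed, for example when reasoning about blacklisting.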

To keep the heartbeat small, we need not send the above fields with every heartbeat. They
could be reported only at a certain interval (typically the minimum slot size, 5 minutes in
the above example).
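
One possible way to decide when a heartbeat should carry the stats is a small throttle like the one below. Again this is just a sketch under assumptions: StatsReportThrottle and shouldReport are made-up names, and how the result would be wired into the actual heartbeat code is left open.

```java
/** Hypothetical helper that limits how often stats are attached to a heartbeat. */
class StatsReportThrottle {
    private final long intervalMillis;
    private boolean reportedOnce = false;
    private long lastReportMillis;

    StatsReportThrottle(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    /**
     * Returns true if the heartbeat sent at nowMillis should include the stats,
     * i.e. on the first call and then once at least intervalMillis has passed
     * since the last heartbeat that included them.
     */
    boolean shouldReport(long nowMillis) {
        if (!reportedOnce || nowMillis - lastReportMillis >= intervalMillis) {
            reportedOnce = true;
            lastReportMillis = nowMillis;
            return true;
        }
        return false;
    }
}
```

With a 5-minute interval, heartbeats in between would go out without the extra fields, so the JobTracker would see the counters refreshed roughly once per slot.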

An alternative would be to compute all this in the JobTracker. My vote goes to doing it in the
TaskTracker, as this mostly concerns an individual TaskTracker and doesn't need any global
information.

Thoughts?


> Collect information about number of tasks succeeded / total per time unit for a tasktracker.
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5931
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5931
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Hemanth Yamijala
>
> Collecting information on the number of tasks succeeded / total per tasktracker and being
> able to see these counts per hour, day and since start time will help reason about things
> like the blacklisting strategy.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

