hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3297) The way in which ReduceTask/TaskTracker gets completion events during shuffle can be improved
Date Mon, 28 Apr 2008 13:32:55 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592806#action_12592806 ]

Devaraj Das commented on HADOOP-3297:
-------------------------------------

Here is a proposal after a discussion with Sameer:
1) The TaskTracker polls the JobTracker for up to 500 task completion events per RPC. If it
gets the full 500, it immediately asks for the next 500, and so on. When it gets fewer
than 500, it switches back to the current behavior: sleep for a fixed amount of time (the
heartbeat interval). Keeping the number of events per RPC small would ensure that each RPC
takes less time, although there would be more RPCs.
2) The Task asks the TaskTracker for up to 10000 events at a time, once every second.
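
A rough sketch of the polling loop in (1), for illustration only. The EventSource interface, the class name, and the method names are hypothetical stand-ins, not the actual JobTracker or TaskTracker API; only the batch size of 500 and the "keep fetching until a short batch arrives" rule come from the proposal.

```java
import java.util.ArrayList;
import java.util.List;

class CompletionEventPoller {
    /** Hypothetical stand-in for the JobTracker RPC; not a real Hadoop interface. */
    interface EventSource {
        List<String> getEvents(int fromIndex, int maxEvents);
    }

    /** Batch size from the proposal: 500 events per RPC. */
    static final int BATCH_SIZE = 500;

    private final EventSource source;
    private final List<String> events = new ArrayList<>();

    CompletionEventPoller(EventSource source) {
        this.source = source;
    }

    /**
     * Fetches batches back to back until one comes back with fewer than
     * BATCH_SIZE events. The caller would then sleep for the heartbeat
     * interval before calling this again. Returns the number of RPCs made.
     */
    int pollUntilShortBatch() {
        int rpcs = 0;
        while (true) {
            List<String> batch = source.getEvents(events.size(), BATCH_SIZE);
            rpcs++;
            events.addAll(batch);
            if (batch.size() < BATCH_SIZE) {
                return rpcs; // short batch: caught up for now
            }
        }
    }

    List<String> events() {
        return events;
    }

    public static void main(String[] args) {
        // Fake source holding 1200 completed-map events (made-up data).
        List<String> all = new ArrayList<>();
        for (int i = 0; i < 1200; i++) {
            all.add("map_" + i);
        }
        EventSource fake = (from, max) -> new ArrayList<>(
                all.subList(Math.min(from, all.size()),
                            Math.min(from + max, all.size())));

        CompletionEventPoller poller = new CompletionEventPoller(fake);
        int rpcs = poller.pollUntilShortBatch();
        System.out.println(rpcs + " RPCs, " + poller.events().size() + " events");
    }
}
```

With 1200 pending events this makes three RPCs (500, 500, 200) before falling back to the sleep. The Task-side loop in (2) would have the same shape, with a larger batch size and a one-second sleep.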

> The way in which ReduceTask/TaskTracker gets completion events during shuffle can be improved
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3297
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3297
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>            Assignee: Devaraj Das
>             Fix For: 0.18.0
>
>
> Certain things like poll frequency, number of events fetched in one go, etc. can probably
> be improved to improve the shuffle performance. This would affect the task-->tasktracker
> and the tasktracker-->jobtracker shuffle related RPCs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

