hadoop-common-dev mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1339) Shuffle should be refactored to a separate task by itself
Date Sun, 02 Mar 2008 05:36:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12574190#action_12574190 ]

Amar Kamat commented on HADOOP-1339:

Does it make sense to spawn a thread from the task tracker rather than a separate JVM? The
reason is that the shuffle code is, again, framework code.

> Shuffle should be refactored to a separate task by itself
> ---------------------------------------------------------
>                 Key: HADOOP-1339
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1339
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Devaraj Das
> Currently, the shuffle phase is part of the reduce task. The idea here is to move the
shuffle out into a first-class task. This will improve network usage, since we will then
be able to schedule shuffle tasks independently and later pin reduce tasks to those nodes.
This makes the most sense for apps with multiple waves of reduces (the second wave
of reduces can start directly with the "reducer" phase).

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
