hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3130) Shuffling takes too long to get the last map output.
Date Tue, 01 Apr 2008 13:35:25 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12584135#action_12584135 ]

Devaraj Das commented on HADOOP-3130:
-------------------------------------

From a utilization point of view, I think a smaller timeout makes sense: we free up a
thread sooner, and it can potentially fetch successfully from some other host. This needs
to be benchmarked. But it also means that we need to keep an eye on the self-healing aspect:
we kill reducers after they fail to fetch a certain number of times (and a connection
establishment failure currently counts as a failure), so we might end up killing reducers
sooner than we do today.
[For killing reducers, we should probably move to a model where we look at the global picture
and use all available information before killing a reducer (moving this logic entirely to the
JobTracker). In the case of map output fetch failures, the JT can then decide whether to kill
a reducer based on which map outputs the reducer is failing to fetch, whether those map nodes
are healthy, etc.]
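
A minimal sketch of the JobTracker-side decision described above, under stated assumptions. Every name here (ReducerKillPolicy, reportFetchFailure, shouldKillReducer, the failure threshold, and the set of unhealthy map hosts) is hypothetical and is not part of Hadoop. It only illustrates the idea that fetch failures against map nodes already known to be unhealthy should not count against the reducer.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical sketch, not Hadoop code: a central policy that decides
 * whether a reducer should be killed, using which map hosts it failed
 * to fetch from and whether those hosts are believed healthy.
 */
class ReducerKillPolicy {
    // Assumed threshold: failures against healthy hosts before a kill.
    private final int maxFailuresOnHealthyHosts;
    // Assumed input: map hosts the central tracker already considers unhealthy.
    private final Set<String> unhealthyMapHosts;
    // Per-reducer count of fetch failures against healthy map hosts only.
    private final Map<String, Integer> failuresOnHealthyHosts = new HashMap<>();

    ReducerKillPolicy(int maxFailuresOnHealthyHosts, Set<String> unhealthyMapHosts) {
        this.maxFailuresOnHealthyHosts = maxFailuresOnHealthyHosts;
        this.unhealthyMapHosts = new HashSet<>(unhealthyMapHosts);
    }

    /** Records one fetch failure reported by a reducer against a map host. */
    public void reportFetchFailure(String reducerId, String mapHost) {
        if (unhealthyMapHosts.contains(mapHost)) {
            // The map side is the likely culprit, so do not blame the reducer.
            return;
        }
        failuresOnHealthyHosts.merge(reducerId, 1, Integer::sum);
    }

    /** True once the reducer has failed often enough against healthy hosts. */
    public boolean shouldKillReducer(String reducerId) {
        return failuresOnHealthyHosts.getOrDefault(reducerId, 0)
                >= maxFailuresOnHealthyHosts;
    }
}
```

With a policy like this, a shorter connection timeout would raise the failure count faster only for fetches against hosts that look healthy, which limits the risk of killing reducers prematurely.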

> Shuffling takes too long to get the last map output.
> ----------------------------------------------------
>
>                 Key: HADOOP-3130
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3130
>             Project: Hadoop Core
>          Issue Type: Bug
>            Reporter: Runping Qi
>         Attachments: HADOOP-3130.patch, shuffling.log
>
>
> I noticed that towards the end of shuffling, the map output fetcher of the reducer backs
> off too aggressively.
> I attach a fraction of one reduce log of my job.
> Note that the last map output was not fetched for 2 minutes.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

