hadoop-common-dev mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3130) Shuffling takes too long to get the last map output.
Date Mon, 31 Mar 2008 06:40:24 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12583568#action_12583568 ]

Amar Kamat commented on HADOOP-3130:

The log output seems to be the main cause of confusion. Based on the logs, this is what we
think happened:
1) The reducer gets the task completion events for a bunch of maps and schedules them.
2) All the map outputs are copied successfully except one.
3) Assume that the Jetty server that was supposed to serve the remaining map's output is busy.
4) After 3 minutes the attempt fails, is retried, and succeeds. 3 minutes is the timeout for a fetch.
This also explains the 2-minute wait mentioned above. In the first minute, the other map outputs
are fetched (i.e., that time overlaps with the stuck fetch). In the remaining 2 minutes (before the
timeout) the reducer is just waiting for the last map's output. The '*need 1 map output*' message
in the reducer's logs should also mention how many fetches are in progress.
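
To make the arithmetic and the logging suggestion concrete, here is a minimal sketch in Java. It is
not the actual reducer code: the class name, method names, and message wording are hypothetical,
and the only values taken from this comment are the 3-minute fetch timeout and the 1 minute
of overlap with other fetches.

```java
import java.time.Duration;

public class ShuffleWaitSketch {
    // Taken from the comment above: a fetch attempt times out after 3 minutes.
    static final Duration FETCH_TIMEOUT = Duration.ofMinutes(3);

    /**
     * Time the reducer spends waiting only on the stuck fetch, given how long
     * that fetch overlapped with other fetches that were still making progress.
     */
    static Duration idleWait(Duration overlapWithOtherFetches) {
        return FETCH_TIMEOUT.minus(overlapWithOtherFetches);
    }

    /** Hypothetical log line that also reports how many fetches are in progress. */
    static String progressMessage(int needed, int inProgress) {
        return String.format("Need %d map output(s), %d fetch(es) in progress",
                needed, inProgress);
    }

    public static void main(String[] args) {
        // 3 min timeout minus 1 min of overlap leaves the observed 2 min wait.
        System.out.println(idleWait(Duration.ofMinutes(1)).toMinutes() + " min idle");
        System.out.println(progressMessage(1, 1));
    }
}
```

With a message like this, a reader of the log could tell that the reducer was waiting on an
in-flight fetch rather than backing off.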

> Shuffling takes too long to get the last map output.
> ----------------------------------------------------
>                 Key: HADOOP-3130
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3130
>             Project: Hadoop Core
>          Issue Type: Bug
>            Reporter: Runping Qi
>         Attachments: shuffling.log
> I noticed that towards the end of shuffling, the map output fetcher of the reducer backs off too aggressively.
> I attach a fraction of one reduce log of my job.
> I noticed that the last map output was not fetched for 2 minutes.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
