hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2980) slow reduce copies - map output locations not being fetched even when map complete
Date Sun, 09 Mar 2008 11:27:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12576752#action_12576752 ]

Devaraj Das commented on HADOOP-2980:
-------------------------------------

There is a fixed delay between two consecutive polls to the JobTracker. As of 0.16, though,
the tasktracker polls the JobTracker (JT) when tasks run out of map output locations (the
minimum delay between two consecutive polls is set to 5 secs).
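
The polling rule described above (poll only when the known map output locations are exhausted, and never more often than a minimum interval) can be sketched roughly as follows. This is an illustration only, not Hadoop's actual code: the class `ThrottledPoller`, its `fetch` callback, and the `clock` parameter are hypothetical names; only the 5 second minimum delay comes from the comment.

```python
import time


class ThrottledPoller:
    """Hypothetical sketch of on-demand polling with a minimum poll interval."""

    def __init__(self, fetch, min_interval=5.0, clock=time.monotonic):
        self.fetch = fetch                # assumed callback that asks the JT for locations
        self.min_interval = min_interval  # 5 secs, per the comment above
        self.clock = clock                # injectable so the sketch can be tested
        self.last_poll = None             # time of the previous poll, if any

    def maybe_poll(self, known_locations):
        """Return newly fetched locations, or [] if no poll was made."""
        if known_locations:
            # Tasks still have locations to copy from: no need to poll.
            return []
        now = self.clock()
        if self.last_poll is not None and now - self.last_poll < self.min_interval:
            # Out of locations, but the previous poll was too recent.
            return []
        self.last_poll = now
        return self.fetch()
```

Under this rule a reduce that runs dry should wait at most about `min_interval` for the next poll, so gaps of about a minute like the ones reported below would have to come from somewhere else.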

> slow reduce copies - map output locations not being fetched even when map complete
> ----------------------------------------------------------------------------------
>
>                 Key: HADOOP-2980
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2980
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.15.3
>            Reporter: Joydeep Sen Sarma
>
> maps are long finished. reduces are stuck looking for map locations. they make progress - but slowly. it almost seems like they get new map locations every minute or so:
> 2008-03-07 18:50:52,737 INFO org.apache.hadoop.mapred.ReduceTask: task_200803041231_3586_r_000021_0 done copying task_200803041231_3586_m_004620_0 output from hadoop082.sf2p.facebook.com..
> 2008-03-07 18:50:53,733 INFO org.apache.hadoop.mapred.ReduceTask: task_200803041231_3586_r_000021_0: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-03-07 18:50:53,733 INFO org.apache.hadoop.mapred.ReduceTask: task_200803041231_3586_r_000021_0 Got 0 known map output location(s); scheduling...
> ...
> 2008-03-07 18:51:49,767 INFO org.apache.hadoop.mapred.ReduceTask: task_200803041231_3586_r_000021_0 Got 50 known map output location(s); scheduling...
> 2008-03-07 18:51:49,767 INFO org.apache.hadoop.mapred.ReduceTask: task_200803041231_3586_r_000021_0 Scheduled 41 of 50 known outputs (0 slow hosts and 9 dup hosts)
> they get about 50 locations at a time and this 1 minute delay pattern is surprisingly common ..
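
To check how common the delay pattern is across a full reduce task log, the "Got N known map output location(s)" lines can be extracted and the time between them measured. The sketch below assumes the log format shown in the excerpt above; the function names `location_events` and `gaps_seconds` are made up for illustration.

```python
import re
from datetime import datetime

# Matches the "Got N known map output location(s)" lines quoted in the report.
# Uses search(), so a leading "> " quote marker is ignored.
EVENT = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO \S+ (\S+) "
    r"Got (\d+) known map output location\(s\)"
)


def location_events(lines):
    """Return (timestamp, task_id, count) for each matching log line."""
    events = []
    for line in lines:
        m = EVENT.search(line)
        if m:
            ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
            events.append((ts, m.group(2), int(m.group(3))))
    return events


def gaps_seconds(events):
    """Seconds between consecutive events, in input order."""
    return [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]
```

Applied to the two such lines quoted above (18:50:53,733 with 0 locations and 18:51:49,767 with 50), this gives a single gap of about 56 seconds. Note that the excerpt elides the lines in between, so a full log may show shorter gaps between intermediate polls.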

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

