From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (HADOOP-3155) reducers stuck at shuffling
Date Thu, 09 Oct 2008 06:38:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638197#action_12638197 ]

zshao edited comment on HADOOP-3155 at 10/8/08 11:37 PM:
--------------------------------------------------------------

@Joydeep: do you mean the following piece of code in the waitForProxy function?
{code}
      try {
        Thread.sleep(1000);
      } catch (InterruptedException ie) {
        // IGNORE
      }
{code}

waitForProxy is called by the main thread (that calls TaskTracker.initialize()). It's not
called by the FetchThread. So FetchThread will never ignore InterruptedException, correct?
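
To make the concern concrete, here is a minimal sketch of why it matters which thread runs a snippet like the one above. It is not Hadoop code: the class InterruptDemo and its methods are hypothetical names made up for illustration. It assumes only standard Java behaviour, namely that Thread.sleep() clears the interrupt status when it throws InterruptedException, so a thread that ignores the exception no longer knows it was asked to stop.

```java
import java.util.function.LongConsumer;

public class InterruptDemo {
    // Same shape as the quoted snippet: the interrupt is swallowed.
    static void sleepIgnoringInterrupt(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            // IGNORE: the interrupt status was cleared by sleep() and is now lost
        }
    }

    // Alternative: restore the interrupt status so callers can still see it.
    static void sleepPreservingInterrupt(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs the sleeper in a worker thread, interrupts that thread, and reports
    // whether the worker can still observe the interrupt once the sleeper returns.
    static boolean interruptSurvives(LongConsumer sleeper) {
        final boolean[] seen = new boolean[1];
        Thread worker = new Thread(() -> {
            sleeper.accept(5000);
            seen[0] = Thread.currentThread().isInterrupted();
        });
        worker.start();
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException("demo thread was interrupted", e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("ignored:   " + interruptSurvives(InterruptDemo::sleepIgnoringInterrupt));
        System.out.println("preserved: " + interruptSurvives(InterruptDemo::sleepPreservingInterrupt));
    }
}
```

Run as is, the first line prints false and the second prints true. Only the thread that executes the ignoring variant loses its interrupt, which is why the question of whether the FetchThread ever reaches this code matters.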


  
> reducers stuck at shuffling 
> ----------------------------
>
>                 Key: HADOOP-3155
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3155
>             Project: Hadoop Core
>          Issue Type: Bug
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.19.0
>
>         Attachments: events-job_200807311630_0007.txt, fetcherThread.patch, fetcherThread.patch, hadoop-3155-logs.tar.gz, jobevents_1007.txt, patch-3155-debug-0.16.txt, patch-3155-debug-0.17.txt, task_200807311630_0007_r_000006_0.syslog.gz
>
>
> This happened with hadoop-0.16.2:
> In a relatively small job (a few hundred mappers and reducers), the reducers were stuck at shuffling.
> I saw lines like the following repeated hundreds of thousands of times over a few hours:
> 2008-04-02 17:17:44,640 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:17:44,641 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:17:44,641 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:17:44,641 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:17:46,643 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:17:46,643 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:17:46,643 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:17:46,643 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:17:48,645 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:17:48,645 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:17:48,645 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:17:48,645 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:17:50,647 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:17:50,647 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:17:50,647 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:17:50,647 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:17:52,649 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:17:52,650 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:17:52,650 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:17:52,650 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:17:54,651 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:17:54,652 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:17:54,652 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:17:54,652 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:17:56,654 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:17:56,654 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:17:56,654 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:17:56,654 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:17:58,656 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:17:58,656 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:17:58,656 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:17:58,656 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:00,658 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:18:00,658 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:18:00,658 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:00,658 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:02,660 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:18:02,661 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:18:02,661 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:02,661 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:04,662 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:18:04,663 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:18:04,663 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:04,663 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:06,664 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:18:06,665 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:18:06,665 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:06,665 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:08,667 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:18:08,667 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:18:08,667 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:08,667 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:10,669 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:18:10,669 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:18:10,669 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:10,669 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:12,671 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:18:12,671 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:18:12,671 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:12,671 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:14,673 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:18:14,674 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:18:14,674 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:14,674 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:16,675 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:18:16,676 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:18:16,676 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:16,676 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:18,678 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:18:18,678 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:18:18,678 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:18,678 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:20,680 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:18:20,680 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:18:20,680 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...
> 2008-04-02 17:18:20,680 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-04-02 17:18:22,682 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Need 2 map output(s)
> 2008-04-02 17:18:22,682 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0:
Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from
previous failures
> 2008-04-02 17:18:22,682 INFO org.apache.hadoop.mapred.ReduceTask: task_200804021200_0337_r_000008_0
Got 0 known map output location(s); scheduling...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

