Message-ID: <575837852.1214597865173.JavaMail.jira@brutus>
Date: Fri, 27 Jun 2008 13:17:45 -0700 (PDT)
From: "Arun C Murthy (JIRA)"
To: core-dev@hadoop.apache.org
Reply-To: core-dev@hadoop.apache.org
Subject: [jira] Updated: (HADOOP-3604) Reduce stuck at shuffling phase
In-Reply-To: <1842401772.1213923707171.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/HADOOP-3604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3604:
----------------------------------

    Status: Patch Available  (was: Open)

> Reduce stuck at shuffling phase
> -------------------------------
>
>                 Key: HADOOP-3604
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3604
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.18.0
>            Reporter: Runping Qi
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.18.0
>
>         Attachments: HADOOP-3604_0_20080623.patch, HADOOP-3604_1_20080624.patch, HADOOP-3604_1_20080624.patch, HADOOP-3604_2_20080625.patch, HADOOP-3604_3_20080627.patch, stack.txt
>
>
> I was running gridmix with Hadoop 0.18.
> I set the map output compression to true.
> Most of the jobs completed just fine.
> Three jobs, however, got stuck.
> Each has one reducer stuck at shuffling phase.
> Here is the log:
> 2008-06-20 00:06:01,264 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=SHUFFLE, sessionId=
> 2008-06-20 00:06:01,415 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed exec [/bin/cat]
> 2008-06-20 00:06:01,463 INFO org.apache.hadoop.mapred.ReduceTask: ShuffleRamManager: MemoryLimit=134217728, MaxSingleShuffleLimit=33554432
> 2008-06-20 00:06:01,474 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
> 2008-06-20 00:06:01,475 INFO org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
> 2008-06-20 00:06:01,476 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,477 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,477 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,478 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,478 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,486 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,486 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,487 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,487 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,488 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,488 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,489 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,489 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,489 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,493 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,496 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,496 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,496 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,497 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,497 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new decompressor
> 2008-06-20 00:06:01,500 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200806192318_0450_r_000016_0 Thread started: Thread for merging on-disk files
> 2008-06-20 00:06:01,500 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200806192318_0450_r_000016_0 Thread waiting: Thread for merging on-disk files
> 2008-06-20 00:06:01,502 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200806192318_0450_r_000016_0 Need another 270 map output(s) where 0 is already in progress
> 2008-06-20 00:06:01,503 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200806192318_0450_r_000016_0 Thread started: Thread for merging in memory files
> 2008-06-20 00:06:01,503 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200806192318_0450_r_000016_0: Got 0 new map-outputs & number of known map outputs is 0
> 2008-06-20 00:06:01,504 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200806192318_0450_r_000016_0 Scheduled 0 of 0 known outputs (0 slow hosts and 0 dup hosts)
> 2008-06-20 00:06:06,654 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200806192318_0450_r_000016_0: Got 269 new map-outputs & number of known map outputs is 269
> 2008-06-20 00:06:06,656 INFO org.apache.hadoop.mapred.ReduceTask: attempt_200806192318_0450_r_000016_0 Scheduled 229 of 269 known outputs (0 slow hosts and 40 dup hosts)
> 2008-06-20 00:06:07,163 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 846183 bytes (210104 raw bytes) into RAM-FS from attempt_200806192318_0450_m_000089_0
> 2008-06-20 00:06:07,163 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 820890 bytes (204371 raw bytes) into RAM-FS from attempt_200806192318_0450_m_000083_0
> 2008-06-20 00:06:07,166 INFO org.apache.hadoop.mapred.ReduceTask: Shuffling 835672 bytes (208085 raw bytes) into RAM-FS from attempt_200806192318_0450_m_000122_0

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.