Date: Tue, 29 Jun 2010 20:41:47 +0530
Subject: newbie - job failing at reduce
From: Siddharth Karandikar
To: common-user@hadoop.apache.org

Hi All,

I am new to Hadoop, but by reading the online docs and other resources I have made some progress and am now trying to run a cluster of 3 nodes. Before doing this, I tried my program in standalone and pseudo-distributed mode, and it works fine there.

Now the issue I am facing: the map phase works correctly, but during reduce I see the following error on one of the nodes:

2010-06-29 14:35:01,848 WARN org.apache.hadoop.mapred.TaskTracker: getMapOutput(attempt_201006291958_0001_m_000008_0,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_201006291958_0001/attempt_201006291958_0001_m_000008_0/output/file.out.index in any of the configured local directories

Let's say this is on Node1. There is no directory named 'taskTracker/jobcache/job_201006291958_0001/attempt_201006291958_0001_m_000008_0' under /tmp/mapred/local/taskTracker/ on Node1. Interestingly, this directory does exist on Node2 (or Node3).

I have run the job multiple times, but it always fails during reduce with the same error.

I have set mapred.local.dir to /tmp/mapred/local on each node in mapred-site.xml. I really don't understand why the mappers are misplacing these files. Or am I missing something in the configuration?

In case someone wants to look at the configuration, I have pasted it below.

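To make the "which node actually has this directory" check repeatable, something like the following could be run on each node. It is only a rough sketch: the helper names `missing_map_output` and `locate` are made up for illustration, and it assumes the log line has the same wording as the TaskTracker warning quoted above and that /tmp/mapred/local is the only entry in mapred.local.dir.

```python
import os
import re

# Pulls the relative path out of the DiskChecker message in the TaskTracker log.
MISSING_PATH_RE = re.compile(
    r"Could not find (\S+) in any of the configured local directories"
)


def missing_map_output(log_line):
    """Return the relative path the TaskTracker could not find, or None."""
    match = MISSING_PATH_RE.search(log_line)
    return match.group(1) if match else None


def locate(rel_path, local_dirs):
    """Return the entries of local_dirs under which rel_path exists."""
    return [d for d in local_dirs if os.path.exists(os.path.join(d, rel_path))]


if __name__ == "__main__":
    line = (
        "org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find "
        "taskTracker/jobcache/job_201006291958_0001/"
        "attempt_201006291958_0001_m_000008_0/output/file.out.index "
        "in any of the configured local directories"
    )
    rel = missing_map_output(line)
    print("missing:", rel)
    # Assumed value of mapred.local.dir on this node.
    print("found under:", locate(rel, ["/tmp/mapred/local"]))
```

An empty "found under" list on the node that logged the warning, and a non-empty one on another node, would match what I am seeing.
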
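Thanks,
Siddharth

Configurations
==========

conf/core-site.xml
---------------------------
<property>
  <name>fs.default.name</name>
  <value>hdfs://192.168.2.115/</value>
</property>

conf/hdfs-site.xml
--------------------------
<property>
  <name>fs.default.name</name>
  <value>hdfs://192.168.2.115</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/home/siddharth/hdfs/data</value>
</property>
<property>
  <name>dfs.name.dir</name>
  <value>/home/siddharth/hdfs/name</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>

conf/mapred-site.xml
------------------------------
<property>
  <name>mapred.job.tracker</name>
  <value>192.168.2.115:8021</value>
</property>
<property>
  <name>mapred.local.dir</name>
  <value>/tmp/mapred/local</value>
  <final>true</final>
</property>
<property>
  <name>mapred.system.dir</name>
  <value>hdfs://192.168.2.115/maperdsystem</value>
  <final>true</final>
</property>
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>4</value>
  <final>true</final>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>4</value>
  <final>true</final>
</property>
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m</value>
</property>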
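
Since the same files are supposed to be identical on all three nodes, a quick way to confirm that is to load each node's copy and compare the properties. The sketch below is only an illustration: `read_properties` and `diff_properties` are hypothetical helper names, and it assumes the files on disk use the usual Hadoop layout of a configuration root containing property elements with name, value and an optional final child.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Map each property name to a (value, is_final) tuple."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        is_final = (prop.findtext("final") or "").strip().lower() == "true"
        if name:
            props[name] = (value, is_final)
    return props


def diff_properties(a, b):
    """Return the sorted names whose (value, final) differ between two configs."""
    return sorted(name for name in set(a) | set(b) if a.get(name) != b.get(name))


if __name__ == "__main__":
    # Stand-ins for the mapred-site.xml contents copied from two nodes.
    node1 = """<configuration>
      <property><name>mapred.local.dir</name>
        <value>/tmp/mapred/local</value><final>true</final></property>
    </configuration>"""
    node2 = """<configuration>
      <property><name>mapred.local.dir</name>
        <value>/tmp/mapred/local</value></property>
    </configuration>"""
    print(diff_properties(read_properties(node1), read_properties(node2)))
```

An empty list means the two copies agree on every property; any name that is printed differs in value or in its final flag.
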