From: Siddharth Karandikar <siddharth.karandikar@gmail.com>
To: common-user@hadoop.apache.org
Date: Wed, 30 Jun 2010 15:32:26 +0530
Subject: Re: newbie - job failing at reduce

Anyone?

On Tue, Jun 29, 2010 at 8:41 PM, Siddharth Karandikar wrote:
> Hi All,
>
> I am new to Hadoop, but by reading the online docs and other resources I
> have moved ahead and am now trying to run a cluster of 3 nodes. Before
> doing this I tried my program on standalone and pseudo-distributed setups,
> and that works fine.
>
> Now the issue that I am facing: the map phase works correctly.
> While doing the reduce, I am seeing the following error on one of the
> nodes -
>
> 2010-06-29 14:35:01,848 WARN org.apache.hadoop.mapred.TaskTracker:
> getMapOutput(attempt_201006291958_0001_m_000008_0,0) failed :
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/jobcache/job_201006291958_0001/attempt_201006291958_0001_m_000008_0/output/file.out.index
> in any of the configured local directories
>
> Let's say this is on Node1. But there is no such directory named
> 'taskTracker/jobcache/job_201006291958_0001/attempt_201006291958_0001_m_000008_0'
> under /tmp/mapred/local/taskTracker/ on Node1. Interestingly, this
> directory is available on Node2 (or Node3). I have tried running the job
> multiple times, but it always fails while reducing, with the same error.
>
> I have configured /tmp/mapred/local on each node from mapred-site.xml.
>
> I really don't understand why the mappers are misplacing these files. Or
> am I missing something in the configuration?
>
> If someone wants to look at the configurations, I have pasted them below.
>
> Thanks,
> Siddharth
>
>
> Configurations
> ==========
>
> conf/core-site.xml
> ---------------------------
>
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://192.168.2.115/</value>
>   </property>
> </configuration>
>
> conf/hdfs-site.xml
> --------------------------
>
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://192.168.2.115</value>
>   </property>
>   <property>
>     <name>dfs.data.dir</name>
>     <value>/home/siddharth/hdfs/data</value>
>   </property>
>   <property>
>     <name>dfs.name.dir</name>
>     <value>/home/siddharth/hdfs/name</value>
>   </property>
>   <property>
>     <name>dfs.replication</name>
>     <value>3</value>
>   </property>
> </configuration>
>
> conf/mapred-site.xml
> ------------------------------
>
> <configuration>
>   <property>
>     <name>mapred.job.tracker</name>
>     <value>192.168.2.115:8021</value>
>   </property>
>   <property>
>     <name>mapred.local.dir</name>
>     <value>/tmp/mapred/local</value>
>     <final>true</final>
>   </property>
>   <property>
>     <name>mapred.system.dir</name>
>     <value>hdfs://192.168.2.115/maperdsystem</value>
>     <final>true</final>
>   </property>
>   <property>
>     <name>mapred.tasktracker.map.tasks.maximum</name>
>     <value>4</value>
>     <final>true</final>
>   </property>
>   <property>
>     <name>mapred.tasktracker.reduce.tasks.maximum</name>
>     <value>4</value>
>     <final>true</final>
>   </property>
>   <property>
>     <name>mapred.child.java.opts</name>
>     <value>-Xmx512m</value>
>   </property>
> </configuration>
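
For context on the warning quoted above: the failing lookup corresponds roughly
to the sketch below. This is a minimal illustrative sketch against the 0.20-era
Hadoop API, not the actual TaskTracker code; the class name FindMapOutput is
made up, and the attempt ID, relative path, and local directory are copied from
the log message and mapred-site.xml in the mail. LocalDirAllocator searches
only the local directories configured on the node it runs on, which is why the
.index file has to be present on the node that serves the getMapOutput request.

FindMapOutput.java (illustrative sketch)
----------------------------------------
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.LocalDirAllocator;
import org.apache.hadoop.fs.Path;

public class FindMapOutput {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Same setting the TaskTracker uses; /tmp/mapred/local as in mapred-site.xml.
        conf.set("mapred.local.dir", "/tmp/mapred/local");

        // Resolves relative paths against the configured local directories of
        // the node this runs on.
        LocalDirAllocator dirAllocator = new LocalDirAllocator("mapred.local.dir");

        // Relative path taken from the warning in the log above.
        String relPath = "taskTracker/jobcache/job_201006291958_0001/"
                + "attempt_201006291958_0001_m_000008_0/output/file.out.index";

        // Throws DiskChecker$DiskErrorException ("Could not find ... in any of
        // the configured local directories") when no local directory contains it.
        Path indexFile = dirAllocator.getLocalPathToRead(relPath, conf);
        System.out.println("Found map output index at: " + indexFile);
    }
}

Run on Node1 this should fail in the same way the servlet does; on the node
that actually ran attempt_201006291958_0001_m_000008_0 it should print the
resolved path, assuming the intermediate output is still on disk.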
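
Since the exception type is DiskChecker$DiskErrorException, a per-node sanity
check of the mapred.local.dir value is also easy to sketch. Again a hedged,
illustrative snippet (the class name CheckLocalDirs is made up); it hard-codes
the /tmp/mapred/local path from the configuration above and uses the same
DiskChecker utility named in the error:

CheckLocalDirs.java (illustrative sketch)
-----------------------------------------
import java.io.File;

import org.apache.hadoop.util.DiskChecker;
import org.apache.hadoop.util.DiskChecker.DiskErrorException;

public class CheckLocalDirs {
    public static void main(String[] args) {
        // Value of mapred.local.dir from mapred-site.xml; run this on each node.
        String[] localDirs = { "/tmp/mapred/local" };
        for (String dir : localDirs) {
            try {
                // Creates the directory if needed and verifies it is a readable,
                // writable directory; throws DiskErrorException otherwise.
                DiskChecker.checkDir(new File(dir));
                System.out.println(dir + ": OK on this node");
            } catch (DiskErrorException e) {
                System.out.println(dir + ": " + e.getMessage());
            }
        }
    }
}

A clean result on every node would point away from a missing or unwritable
local directory and toward the map output simply living on a different node
than the one the reducer is asking.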