Subject: Re: Map succeeds but reduce hangs
From: navaz <navaz.enc@gmail.com>
To: user@hadoop.apache.org
Date: Fri, 3 Jan 2014 09:22:08 -0600

My mapred-site.xml file is given below. I haven't set any
mapred.task.tracker.report.address.

hduser@pc321:/usr/local/hadoop/conf$ vi mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
 <name>mapred.job.tracker</name>
 <value>pc228:54311</value>
 <description>The host and port that the MapReduce job tracker runs
 at. If "local", then jobs are run in-process as a single map
 and reduce task.
 </description>
</property>
</configuration>

On Thu, Jan 2, 2014 at 12:28 PM, Vinod Kumar Vavilapalli <vinodkv@hortonworks.com> wrote:

> Check the TaskTracker configuration in mapred-site.xml:
> mapred.task.tracker.report.address. You may be setting it to 127.0.0.1:0
> or localhost:0. Change it to 0.0.0.0:0 and restart the daemons.
>
> Thanks,
> +Vinod
>
> On Jan 1, 2014, at 2:14 PM, navaz <navaz.enc@gmail.com> wrote:
>
> I don't know why it is running on localhost. I have commented it out.
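
A minimal sketch of the change Vinod suggests, assuming a Hadoop 1.x
mapred-site.xml (mapred.task.tracker.report.address defaults to 127.0.0.1:0
in that line; verify against your version's mapred-default.xml). It would be
added on every TaskTracker node, followed by a daemon restart:

<property>
 <name>mapred.task.tracker.report.address</name>
 <value>0.0.0.0:0</value>
 <description>The interface and port the TaskTracker's report server
 binds to. 0.0.0.0 binds all interfaces instead of loopback only;
 port 0 lets the tracker pick a free port.
 </description>
</property>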
> ==================================================================
> *slave1:*
> Hostname: pc321
>
> hduser@pc321:/etc$ vi hosts
> #127.0.0.1      localhost loghost localhost.myslice.ch-geni-net.emulab.net
> 155.98.39.28    pc228
> 155.98.39.121   pc321
> 155.98.39.27    dn3.myslice.ch-geni-net.emulab.net
> ========================================================================
> slave2:
> hostname: dn3.myslice.ch-geni-net.emulab.net
> hduser@dn3:/etc$ vi hosts
> #127.0.0.1      localhost loghost localhost.myslice.ch-geni-net.emulab.net
> 155.98.39.28    pc228
> 155.98.39.121   pc321
> 155.98.39.27    dn3.myslice.ch-geni-net.emulab.net
> ========================================================================
> Master:
> Hostname: pc228
> hduser@pc228:/etc$ vi hosts
> #127.0.0.1      localhost loghost localhost.myslice.ch-geni-net.emulab.net
> 155.98.39.28    pc228
> 155.98.39.121   pc321
> #155.98.39.19   slave2
> 155.98.39.27    dn3.myslice.ch-geni-net.emulab.net
> ============================================================================
> I have replaced localhost with pc228 in core-site.xml and
> mapred-site.xml, and set the replication factor to 3.
>
> I am able to ssh to pc321 and dn3.myslice.ch-geni-net.emulab.net from the
> master.
>
> hduser@pc228:/usr/local/hadoop/conf$ more slaves
> pc228
> pc321
> dn3.myslice.ch-geni-net.emulab.net
>
> hduser@pc228:/usr/local/hadoop/conf$ more masters
> pc228
> hduser@pc228:/usr/local/hadoop/conf$
>
> Am I doing anything wrong here?
>
> On Wed, Jan 1, 2014 at 4:54 PM, Hardik Pandya <smarty.juice@gmail.com> wrote:
>
>> Do you have your hostnames properly configured in /etc/hosts? Have you
>> tried 192.168.?.? instead of localhost 127.0.0.1?
>>
>> On Wed, Jan 1, 2014 at 11:33 AM, navaz <navaz.enc@gmail.com> wrote:
>>
>>> Thanks. But I wonder why map succeeds 100%. How does it resolve the
>>> hostname?
>>>
>>> Now reduce does reach 100%, but it bails out on slave2 and slave3. (But
>>> mapping succeeded on these nodes.)
>>>
>>> Does it look up the hostname only for reduce?
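
A hedged explanation of why only reduce fails, based on how the Hadoop 1.x
shuffle works: map tasks read their split and write output to the local disk
of the node that ran them, so no cross-node name resolution is needed. Each
reducer, however, must fetch every map's output over HTTP from the
TaskTracker that produced it, using the address that tracker registered
with. A tracker registered as localhost/127.0.0.1 (as in the logs below)
makes the reducer connect to its own machine, every remote fetch fails, and
the reduce bails out with MAX_FAILED_UNIQUE_FETCHES. A quick check to run on
each node (hostnames are the ones from this thread; 50060 is the Hadoop 1.x
default TaskTracker HTTP/shuffle port):

hduser@pc321:~$ hostname                  # expect the real node name, e.g. pc321
hduser@pc321:~$ getent hosts $(hostname)  # expect the LAN IP (155.98.39.x), not 127.0.0.1
hduser@pc321:~$ netstat -tlnp 2>/dev/null | grep 50060   # shuffle port should not be bound to 127.0.0.1 only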
>>>
>>> 14/01/01 09:09:38 INFO mapred.JobClient: Running job: job_201401010908_0001
>>> 14/01/01 09:09:39 INFO mapred.JobClient:  map 0% reduce 0%
>>> 14/01/01 09:10:00 INFO mapred.JobClient:  map 33% reduce 0%
>>> 14/01/01 09:10:01 INFO mapred.JobClient:  map 66% reduce 0%
>>> 14/01/01 09:10:05 INFO mapred.JobClient:  map 100% reduce 0%
>>> 14/01/01 09:10:14 INFO mapred.JobClient:  map 100% reduce 22%
>>> 14/01/01 09:17:32 INFO mapred.JobClient:  map 100% reduce 0%
>>> 14/01/01 09:17:35 INFO mapred.JobClient: Task Id : attempt_201401010908_0001_r_000000_0, Status : FAILED
>>> Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
>>> 14/01/01 09:17:46 INFO mapred.JobClient:  map 100% reduce 11%
>>> 14/01/01 09:17:50 INFO mapred.JobClient:  map 100% reduce 22%
>>> 14/01/01 09:25:06 INFO mapred.JobClient:  map 100% reduce 0%
>>> 14/01/01 09:25:10 INFO mapred.JobClient: Task Id : attempt_201401010908_0001_r_000000_1, Status : FAILED
>>> Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
>>> 14/01/01 09:25:34 INFO mapred.JobClient:  map 100% reduce 100%
>>> 14/01/01 09:25:42 INFO mapred.JobClient: Job complete: job_201401010908_0001
>>> 14/01/01 09:25:42 INFO mapred.JobClient: Counters: 29
>>>
>>> Job Tracker logs:
>>>
>>> 2014-01-01 09:09:59,874 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201401010908_0001_m_000002_0' has completed task_201401010908_0001_m_000002 successfully.
>>> 2014-01-01 09:10:04,231 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201401010908_0001_m_000001_0' has completed task_201401010908_0001_m_000001 successfully.
>>> 2014-01-01 09:17:30,527 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201401010908_0001_r_000000_0: Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
>>> 2014-01-01 09:17:30,528 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201401010908_0001_r_000000_0'
>>> 2014-01-01 09:17:30,529 INFO org.apache.hadoop.mapred.JobTracker: Adding task (TASK_CLEANUP) 'attempt_201401010908_0001_r_000000_0' to tip task_201401010908_0001_r_000000, for tracker 'tracker_slave3:localhost/127.0.0.1:44663'
>>> 2014-01-01 09:17:35,130 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201401010908_0001_r_000000_0'
>>> 2014-01-01 09:17:35,213 INFO org.apache.hadoop.mapred.JobTracker: Adding task (REDUCE) 'attempt_201401010908_0001_r_000000_1' to tip task_201401010908_0001_r_000000, for tracker 'tracker_slave2:localhost/127.0.0.1:51438'
>>> 2014-01-01 09:25:05,493 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201401010908_0001_r_000000_1: Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
>>> 2014-01-01 09:25:05,493 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201401010908_0001_r_000000_1'
>>> 2014-01-01 09:25:05,494 INFO org.apache.hadoop.mapred.JobTracker: Adding task (TASK_CLEANUP) 'attempt_201401010908_0001_r_000000_1' to tip task_201401010908_0001_r_000000, for tracker 'tracker_slave2:localhost/127.0.0.1:51438'
>>> 2014-01-01 09:25:10,087 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201401010908_0001_r_000000_1'
>>> 2014-01-01 09:25:10,109 INFO org.apache.hadoop.mapred.JobTracker: Adding task (REDUCE) 'attempt_201401010908_0001_r_000000_2' to tip task_201401010908_0001_r_000000, for tracker 'tracker_master:localhost/127.0.0.1:57156'
>>> 2014-01-01 09:25:33,340 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201401010908_0001_r_000000_2' has completed task_201401010908_0001_r_000000 successfully.
>>> 2014-01-01 09:25:33,462 INFO org.apache.hadoop.mapred.JobTracker: Adding task (JOB_CLEANUP) 'attempt_201401010908_0001_m_000003_0' to tip task_201401010908_0001_m_000003, for tracker 'tracker_master:localhost/127.0.0.1:57156'
>>> 2014-01-01 09:25:42,304 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201401010908_0001_m_000003_0' has completed task_201401010908_0001_m_000003 successfully.
>>>
>>> On Tue, Dec 31, 2013 at 4:56 PM, Hardik Pandya <smarty.juice@gmail.com> wrote:
>>>
>>>> As expected, it's failing during the shuffle.
>>>>
>>>> It seems like HDFS could not resolve the DNS names for the slave nodes.
>>>>
>>>> Have you configured your slaves' host names correctly?
>>>>
>>>> 2013-12-31 14:27:54,207 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201312311107_0003_r_000000_0: Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
>>>> 2013-12-31 14:27:54,208 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201312311107_0003_r_000000_0'
>>>> 2013-12-31 14:27:54,209 INFO org.apache.hadoop.mapred.JobTracker: Adding task (TASK_CLEANUP) 'attempt_201312311107_0003_r_000000_0' to tip task_201312311107_0003_r_000000, for tracker 'tracker_slave2:localhost/127.0.0.1:52677'
>>>> 2013-12-31 14:27:58,797 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201312311107_0003_r_000000_0'
>>>> 2013-12-31 14:27:58,815 INFO org.apache.hadoop.mapred.JobTracker: Adding task (REDUCE) 'attempt_201312311107_0003_r_000000_1' to tip task_201312311107_0003_r_000000, for tracker 'tracker_slave1:localhost/127.0.0.1:57492'
>>>>
>>>> On Tue, Dec 31, 2013 at 4:42 PM, navaz <navaz.enc@gmail.com> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> My hdfs-site is configured for 4 nodes. (One is the master and 3 are
>>>>> slaves.)
>>>>>
>>>>> <property>
>>>>>  <name>dfs.replication</name>
>>>>>  <value>4</value>
>>>>> </property>
>>>>>
>>>>> Running start-dfs.sh and stop-mapred.sh doesn't solve the problem.
>>>>>
>>>>> I also tried to run the program after formatting the namenode (master),
>>>>> which also fails.
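
A side observation, not raised in the thread: dfs.replication is 4 here,
but the first message says the cluster has only 3 datanodes, so every block
would stay under-replicated until the factor is lowered (navaz later
mentions setting it to 3, which matches the node count). One way to confirm
how many datanodes actually registered, using the Hadoop 1.x admin tool
(the grep pattern assumes that version's report wording):

hduser@pc228:/usr/local/hadoop$ hadoop dfsadmin -report | grep -i "datanodes available"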
>>>>> My jobtracker logs on the master (name node) are given below.
>>>>>
>>>>> 2013-12-31 14:27:35,534 INFO org.apache.hadoop.mapred.JobInProgress: job_201312311107_0004: nMaps=3 nReduces=1 max=-1
>>>>> 2013-12-31 14:27:35,594 INFO org.apache.hadoop.mapred.JobTracker: Job job_201312311107_0004 added successfully for user 'hduser' to queue 'default'
>>>>> 2013-12-31 14:27:35,594 INFO org.apache.hadoop.mapred.AuditLogger: USER=hduser IP=155.98.39.28 OPERATION=SUBMIT_JOB TARGET=job_201312311107_0004 RESULT=SUCCESS
>>>>> 2013-12-31 14:27:35,594 INFO org.apache.hadoop.mapred.JobTracker: Initializing job_201312311107_0004
>>>>> 2013-12-31 14:27:35,595 INFO org.apache.hadoop.mapred.JobInProgress: Initializing job_201312311107_0004
>>>>> 2013-12-31 14:27:35,785 INFO org.apache.hadoop.mapred.JobInProgress: jobToken generated and stored with users keys in /app/hadoop/tmp/mapred/system/job_201312311107_0004/jobToken
>>>>> 2013-12-31 14:27:35,795 INFO org.apache.hadoop.mapred.JobInProgress: Input size for job job_201312311107_0004 = 3671523. Number of splits = 3
>>>>> 2013-12-31 14:27:35,795 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201312311107_0004_m_000000 has split on node:/default-rack/master
>>>>> 2013-12-31 14:27:35,795 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201312311107_0004_m_000000 has split on node:/default-rack/slave2
>>>>> 2013-12-31 14:27:35,796 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201312311107_0004_m_000000 has split on node:/default-rack/slave1
>>>>> 2013-12-31 14:27:35,796 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201312311107_0004_m_000000 has split on node:/default-rack/slave3
>>>>> 2013-12-31 14:27:35,796 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201312311107_0004_m_000001 has split on node:/default-rack/master
>>>>> 2013-12-31 14:27:35,796 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201312311107_0004_m_000001 has split on node:/default-rack/slave1
>>>>> 2013-12-31 14:27:35,797 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201312311107_0004_m_000001 has split on node:/default-rack/slave3
>>>>> 2013-12-31 14:27:35,797 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201312311107_0004_m_000001 has split on node:/default-rack/slave2
>>>>> 2013-12-31 14:27:35,797 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201312311107_0004_m_000002 has split on node:/default-rack/master
>>>>> 2013-12-31 14:27:35,797 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201312311107_0004_m_000002 has split on node:/default-rack/slave1
>>>>> 2013-12-31 14:27:35,797 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201312311107_0004_m_000002 has split on node:/default-rack/slave2
>>>>> 2013-12-31 14:27:35,797 INFO org.apache.hadoop.mapred.JobInProgress: tip:task_201312311107_0004_m_000002 has split on node:/default-rack/slave3
>>>>> 2013-12-31 14:27:35,798 INFO org.apache.hadoop.mapred.JobInProgress: job_201312311107_0004 LOCALITY_WAIT_FACTOR=1.0
>>>>> 2013-12-31 14:27:35,798 INFO org.apache.hadoop.mapred.JobInProgress: Job job_201312311107_0004 initialized successfully with 3 map tasks and 1 reduce tasks.
>>>>> 2013-12-31 14:27:35,913 INFO org.apache.hadoop.mapred.JobTracker: Adding task (JOB_SETUP) 'attempt_201312311107_0004_m_000004_0' to tip task_201312311107_0004_m_000004, for tracker 'tracker_slave1:localhost/127.0.0.1:57492'
>>>>> 2013-12-31 14:27:40,876 INFO org.apache.hadoop.mapred.JobInProgress: Task 'attempt_201312311107_0004_m_000004_0' has completed task_201312311107_0004_m_000004 successfully.
>>>>> 2013-12-31 14:27:40,878 INFO org.apache.hadoop.mapred.JobTracker: >>>>> Adding task (MAP) 'attempt_201312311107_0004_m_000000_0' to tip task_20 >>>>> 1312311107_0004_m_000000, for tracker 'tracker_slave1:localhost/ >>>>> 127.0.0.1:57492' >>>>> 2013-12-31 14:27:40,878 INFO org.apache.hadoop.mapred.JobInProgress: >>>>> Choosing data-local task task_201312311107_0004_m_000000 >>>>> 2013-12-31 14:27:40,907 INFO org.apache.hadoop.mapred.JobTracker: >>>>> Adding task (MAP) 'attempt_201312311107_0004_m_000001_0' to tip task_20 >>>>> 1312311107_0004_m_000001, for tracker 'tracker_slave2:localhost/ >>>>> 127.0.0.1:52677' >>>>> 2013-12-31 14:27:40,908 INFO org.apache.hadoop.mapred.JobInProgress: >>>>> Choosing data-local task task_201312311107_0004_m_000001 >>>>> 2013-12-31 14:27:41,122 INFO org.apache.hadoop.mapred.JobTracker: >>>>> Adding task (MAP) 'attempt_201312311107_0004_m_000002_0' to tip task_20 >>>>> 1312311107_0004_m_000002, for tracker 'tracker_slave3:localhost/ >>>>> 127.0.0.1:46845' >>>>> 2013-12-31 14:27:41,123 INFO org.apache.hadoop.mapred.JobInProgress: >>>>> Choosing data-local task task_201312311107_0004_m_000002 >>>>> 2013-12-31 14:27:49,659 INFO org.apache.hadoop.mapred.JobInProgress: >>>>> Task 'attempt_201312311107_0004_m_000002_0' has completed task_20131 >>>>> 2311107_0004_m_000002 successfully. >>>>> 2013-12-31 14:27:49,662 INFO org.apache.hadoop.mapred.JobTracker: >>>>> Adding task (REDUCE) 'attempt_201312311107_0004_r_000000_0' to tip task >>>>> _201312311107_0004_r_000000, for tracker 'tracker_slave3:localhost/ >>>>> 127.0.0.1:46845' >>>>> 2013-12-31 14:27:50,338 INFO org.apache.hadoop.mapred.JobInProgress: >>>>> Task 'attempt_201312311107_0004_m_000000_0' has completed task_20131 >>>>> 2311107_0004_m_000000 successfully. >>>>> 2013-12-31 14:27:51,168 INFO org.apache.hadoop.mapred.JobInProgress: >>>>> Task 'attempt_201312311107_0004_m_000001_0' has completed task_20131 >>>>> 2311107_0004_m_000001 successfully. >>>>> 2013-12-31 14:27:54,207 INFO org.apache.hadoop.mapred.TaskInProgress: >>>>> Error from attempt_201312311107_0003_r_000000_0: Shuffle Error: Exc >>>>> eeded MAX_FAILED_UNIQUE_FETCHES; bailing-out. >>>>> 2013-12-31 14:27:54,208 INFO org.apache.hadoop.mapred.JobTracker: >>>>> Removing task 'attempt_201312311107_0003_r_000000_0' >>>>> 2013-12-31 14:27:54,209 INFO org.apache.hadoop.mapred.JobTracker: >>>>> Adding task (TASK_CLEANUP) 'attempt_201312311107_0003_r_000000_0' to ti >>>>> p task_201312311107_0003_r_000000, for tracker >>>>> 'tracker_slave2:localhost/127.0.0.1:52677' >>>>> 2013-12-31 14:27:58,797 INFO org.apache.hadoop.mapred.JobTracker: >>>>> Removing task 'attempt_201312311107_0003_r_000000_0' >>>>> 2013-12-31 14:27:58,815 INFO org.apache.hadoop.mapred.JobTracker: >>>>> Adding task (REDUCE) 'attempt_201312311107_0003_r_000000_1' to tip task >>>>> _201312311107_0003_r_000000, for tracker 'tracker_slave1:localhost/ >>>>> 127.0.0.1:57492' >>>>> hduser@pc228:/usr/local/hadoop/logs$ >>>>> >>>>> >>>>> I am referring the below document to configure hadoop cluster. >>>>> >>>>> >>>>> http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/ >>>>> >>>>> Did i miss something ? Pls guide. >>>>> >>>>> Thanks >>>>> Navaz >>>>> >>>>> >>>>> On Tue, Dec 31, 2013 at 3:25 PM, Hardik Pandya >>>> > wrote: >>>>> >>>>>> what does your job log says? is yout hdfs-site configured properly to >>>>>> find 3 data nodes? this could very well getting stuck in shuffle phase >>>>>> >>>>>> last thing to try : does stop-all and start-all helps? 
>>>>>>
>>>>>> On Tue, Dec 31, 2013 at 11:40 AM, navaz <navaz.enc@gmail.com> wrote:
>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> I am running a Hadoop cluster with 1 name node and 3 data nodes.
>>>>>>>
>>>>>>> My HDFS looks like this.
>>>>>>>
>>>>>>> hduser@nm:/usr/local/hadoop$ hadoop fs -ls /user/hduser/getty/gutenberg
>>>>>>> Warning: $HADOOP_HOME is deprecated.
>>>>>>>
>>>>>>> Found 7 items
>>>>>>> -rw-r--r--   4 hduser supergroup     343691 2013-12-30 19:12 /user/hduser/getty/gutenberg/pg132.txt
>>>>>>> -rw-r--r--   4 hduser supergroup     594933 2013-12-30 19:12 /user/hduser/getty/gutenberg/pg1661.txt
>>>>>>> -rw-r--r--   4 hduser supergroup    1945886 2013-12-30 19:12 /user/hduser/getty/gutenberg/pg19699.txt
>>>>>>> -rw-r--r--   4 hduser supergroup     674570 2013-12-30 19:12 /user/hduser/getty/gutenberg/pg20417.txt
>>>>>>> -rw-r--r--   4 hduser supergroup    1573150 2013-12-30 19:12 /user/hduser/getty/gutenberg/pg4300.txt
>>>>>>> -rw-r--r--   4 hduser supergroup    1423803 2013-12-30 19:12 /user/hduser/getty/gutenberg/pg5000.txt
>>>>>>> -rw-r--r--   4 hduser supergroup     393968 2013-12-30 19:12 /user/hduser/getty/gutenberg/pg972.txt
>>>>>>> hduser@nm:/usr/local/hadoop$
>>>>>>>
>>>>>>> When I start the MapReduce wordcount program, mapping reaches 100%
>>>>>>> but reduce hangs at 14%.
>>>>>>>
>>>>>>> hduser@nm:~$ hadoop jar chiu-wordcount2.jar WordCount /user/hduser/getty/gutenberg /user/hduser/getty/gutenberg_out3
>>>>>>> Warning: $HADOOP_HOME is deprecated.
>>>>>>>
>>>>>>> 13/12/31 09:31:07 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
>>>>>>> 13/12/31 09:31:07 INFO input.FileInputFormat: Total input paths to process : 7
>>>>>>> 13/12/31 09:31:08 INFO util.NativeCodeLoader: Loaded the native-hadoop library
>>>>>>> 13/12/31 09:31:08 WARN snappy.LoadSnappy: Snappy native library not loaded
>>>>>>> 13/12/31 09:31:08 INFO mapred.JobClient: Running job: job_201312310929_0001
>>>>>>> 13/12/31 09:31:09 INFO mapred.JobClient:  map 0% reduce 0%
>>>>>>> 13/12/31 09:31:29 INFO mapred.JobClient:  map 14% reduce 0%
>>>>>>> 13/12/31 09:31:34 INFO mapred.JobClient:  map 32% reduce 0%
>>>>>>> 13/12/31 09:31:35 INFO mapred.JobClient:  map 75% reduce 0%
>>>>>>> 13/12/31 09:31:36 INFO mapred.JobClient:  map 90% reduce 0%
>>>>>>> 13/12/31 09:31:37 INFO mapred.JobClient:  map 99% reduce 0%
>>>>>>> 13/12/31 09:31:38 INFO mapred.JobClient:  map 100% reduce 0%
>>>>>>> 13/12/31 09:31:43 INFO mapred.JobClient:  map 100% reduce 14%
>>>>>>>
>>>>>>> <HANGS HERE>
>>>>>>>
>>>>>>> Could you please help me in resolving this issue?
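
Since the JobTracker logs earlier in the thread show every tracker
registering as 'tracker_<name>:localhost/127.0.0.1:<port>', a quick way to
confirm a fix took effect after a restart is to list the registered tracker
names again (log path assumes the layout used in this thread; the "healthy"
line is hypothetical):

hduser@pc228:/usr/local/hadoop$ grep -o "tracker_[^']*" logs/hadoop-hduser-jobtracker-*.log | sort -u
# broken, as seen in this thread:  tracker_slave2:localhost/127.0.0.1:51438
# healthy (hypothetical):          tracker_pc321:pc321/155.98.39.121:51438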
>>>>>>>
>>>>>>> Thanks & Regards
>>>>>>> *Abdul Navaz*
>>>>>>>
>>>>>
>>>>> --
>>>>> *Abdul Navaz*
>>>>> *Masters in Network Communications*
>>>>> *University of Houston*
>>>>> *Houston, TX - 77204-4020*
>>>>> *Ph - 281-685-0388*
>>>>> *fabdulnavaz@uh.edu*
>>>>>
>>>
>>> --
>>> *Abdul Navaz*
>>> *Masters in Network Communications*
>>> *University of Houston*
>>> *Houston, TX - 77204-4020*
>>> *Ph - 281-685-0388*
>>> *fabdulnavaz@uh.edu*
>>>
>
> --
> *Abdul Navaz*
> *Masters in Network Communications*
> *University of Houston*
> *Houston, TX - 77204-4020*
> *Ph - 281-685-0388*
> *fabdulnavaz@uh.edu*
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity
> to which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.

--
*Abdul Navaz*
*Masters in Network Communications*
*University of Houston*
*Houston, TX - 77204-4020*
*Ph - 281-685-0388*
*fabdulnavaz@uh.edu*
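
For completeness, a sketch of the /etc/hosts layout the referenced
multi-node tutorial assumes on every node, using the names and IPs from this
thread: keep the loopback line mapping 127.0.0.1 to localhost only, and
never map a machine's own hostname to 127.0.0.1 (or Debian's 127.0.1.1),
otherwise its TaskTracker registers as localhost/127.0.0.1 exactly as in the
logs above:

127.0.0.1       localhost
155.98.39.28    pc228
155.98.39.121   pc321
155.98.39.27    dn3.myslice.ch-geni-net.emulab.net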