From: Stone <stones.gao@gmail.com>
To: user@hadoop.apache.org
Date: Fri, 31 Aug 2012 09:59:54 +0800
Subject: Re: Hadoop in Pseudo-Distributed mode on Mac OS X 10.8

No, the content in hadoop-env.sh is like the following:

export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
export HADOOP_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk"

Best Regards,
Stone

On Fri, Aug 31, 2012 at 3:31 AM, Harsh J wrote:
> Hi Stone,
>
> Do you have any flags in hadoop-env.sh indicating preference of IPv4 over
> IPv6?
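For reference, a minimal sketch of how these two settings are usually written in hadoop-env.sh. The realm/KDC values are the dummy ones from this thread (the HADOOP-7489 workaround for the SCDynamicStore warning needs a space before the second -D flag and no space inside the KDC list), and -Djava.net.preferIPv4Stack=true is the standard JVM property for the IPv4 preference Harsh asks about:

```shell
# hadoop-env.sh sketch (values are placeholders from this thread, not site config)

# HADOOP-7489 workaround for "Unable to load realm info from SCDynamicStore"
# on Mac OS X: give the JVM a dummy Kerberos realm and KDC list.
export HADOOP_OPTS="-Djava.security.krb5.realm=OX.AC.UK -Djava.security.krb5.kdc=kdc0.ox.ac.uk:kdc1.ox.ac.uk $HADOOP_OPTS"

# Standard JVM flag to prefer IPv4 sockets over IPv6.
export HADOOP_OPTS="-Djava.net.preferIPv4Stack=true $HADOOP_OPTS"

echo "$HADOOP_OPTS"
```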
>
> On Thu, Aug 30, 2012 at 11:26 PM, Stone wrote:
> > FYI : the jstack output for the hanged job:
> >
> > Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.8-b03-424 mixed mode):
> >
> > "RMI TCP Accept-0" daemon prio=9 tid=7f833f1ba800 nid=0x109b90000 runnable [109b8f000]
> >    java.lang.Thread.State: RUNNABLE
> > at java.net.PlainSocketImpl.socketAccept(Native Method)
> > at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:408)
> > - locked <7bd6a00a8> (a java.net.SocksSocketImpl)
> > at java.net.ServerSocket.implAccept(ServerSocket.java:462)
> > at java.net.ServerSocket.accept(ServerSocket.java:430)
> > at sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:34)
> > at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:369)
> > at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:341)
> > at java.lang.Thread.run(Thread.java:680)
> >
> > "Attach Listener" daemon prio=9 tid=7f833d000800 nid=0x108b72000 waiting on condition [00000000]
> >    java.lang.Thread.State: RUNNABLE
> >
> > "LeaseChecker" daemon prio=5 tid=7f833f30d000 nid=0x10a000000 waiting on condition [109fff000]
> >    java.lang.Thread.State: TIMED_WAITING (sleeping)
> > at java.lang.Thread.sleep(Native Method)
> > at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.run(DFSClient.java:1302)
> > at java.lang.Thread.run(Thread.java:680)
> >
> > "IPC Client (47) connection to localhost/127.0.0.1:9001 from stone" daemon prio=5 tid=7f833d0de000 nid=0x109dfa000 in Object.wait() [109df9000]
> >    java.lang.Thread.State: TIMED_WAITING (on object monitor)
> > at java.lang.Object.wait(Native Method)
> > - waiting on <7bd6a02e0> (a org.apache.hadoop.ipc.Client$Connection)
> > at org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:680)
> > - locked <7bd6a02e0> (a org.apache.hadoop.ipc.Client$Connection)
> > at org.apache.hadoop.ipc.Client$Connection.run(Client.java:723)
> >
> > "sendParams-0" daemon prio=5 tid=7f833e1cb000 nid=0x109cf7000 waiting on condition [109cf6000]
> >    java.lang.Thread.State: TIMED_WAITING (parking)
> > at sun.misc.Unsafe.park(Native Method)
> > - parking to wait for <7c1950890> (a java.util.concurrent.SynchronousQueue$TransferStack)
> > at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
> > at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:424)
> > at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:323)
> > at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:874)
> > at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:945)
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> > at java.lang.Thread.run(Thread.java:680)
> >
> > "Low Memory Detector" daemon prio=5 tid=7f833e021800 nid=0x109084000 runnable [00000000]
> >    java.lang.Thread.State: RUNNABLE
> >
> > "C2 CompilerThread1" daemon prio=9 tid=7f833e021000 nid=0x108f81000 waiting on condition [00000000]
> >    java.lang.Thread.State: RUNNABLE
> >
> > "C2 CompilerThread0" daemon prio=9 tid=7f833e020000 nid=0x108e7e000 waiting on condition [00000000]
> >    java.lang.Thread.State: RUNNABLE
> >
> > "Signal Dispatcher" daemon prio=9 tid=7f833e01f800 nid=0x108d7b000 runnable [00000000]
> >    java.lang.Thread.State: RUNNABLE
> >
> > "Surrogate Locker Thread (Concurrent GC)" daemon prio=5 tid=7f833e01e800 nid=0x108c78000 waiting on condition [00000000]
> >    java.lang.Thread.State: RUNNABLE
> >
> > "Finalizer" daemon prio=8 tid=7f833e00f000 nid=0x108a6f000 in Object.wait() [108a6e000]
> >    java.lang.Thread.State: WAITING (on object monitor)
> > at java.lang.Object.wait(Native Method)
> > - waiting on <7c19c1550> (a java.lang.ref.ReferenceQueue$Lock)
> > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
> > - locked <7c19c1550> (a java.lang.ref.ReferenceQueue$Lock)
> > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
> > at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
> >
> > "Reference Handler" daemon prio=10 tid=7f833e00e000 nid=0x10896c000 in Object.wait() [10896b000]
> >    java.lang.Thread.State: WAITING (on object monitor)
> > at java.lang.Object.wait(Native Method)
> > - waiting on <7c1950860> (a java.lang.ref.Reference$Lock)
> > at java.lang.Object.wait(Object.java:485)
> > at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
> > - locked <7c1950860> (a java.lang.ref.Reference$Lock)
> >
> > "main" prio=5 tid=7f833f000800 nid=0x1008eb000 waiting on condition [1008e9000]
> >    java.lang.Thread.State: TIMED_WAITING (sleeping)
> > at java.lang.Thread.sleep(Native Method)
> > at org.apache.hadoop.mapred.JobClient.monitorAndPrintJob(JobClient.java:1316)
> > at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:509)
> > at org.apache.hadoop.examples.WordCount.main(WordCount.java:67)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > at java.lang.reflect.Method.invoke(Method.java:597)
> > at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> > at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> > at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > at java.lang.reflect.Method.invoke(Method.java:597)
> > at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
> >
> > "VM Thread" prio=9 tid=7f833e009800 nid=0x108869000 runnable
> >
> > "Gang worker#0 (Parallel GC Threads)" prio=9 tid=7f833f002000 nid=0x103ced000 runnable
> >
> > "Gang worker#1 (Parallel GC Threads)" prio=9 tid=7f833f002800 nid=0x103df0000 runnable
> >
> > "Gang worker#2 (Parallel GC Threads)" prio=9 tid=7f833f003000 nid=0x103ef3000 runnable
> >
> > "Gang worker#3 (Parallel GC Threads)" prio=9 tid=7f833f004000 nid=0x103ff6000 runnable
> >
> > "Concurrent Mark-Sweep GC Thread" prio=9 tid=7f833f07f000 nid=0x1084e7000 runnable
> >
> > "VM Periodic Task Thread" prio=10 tid=7f833e033800 nid=0x109187000 waiting on condition
> >
> > "Exception Catcher Thread" prio=10 tid=7f833f001800 nid=0x100b16000 runnable
> >
> > JNI global references: 1457
> >
> > Best Regards,
> > Stone
> >
> > On Fri, Aug 31, 2012 at 1:24 AM, Stone wrote:
> >>
> >> I got the same issue today. map tasks finished quickly but reduce is
> >> always 0%. I am also running Mac OS X 10.8. (cdh3u4)
> >>
> >> 12/08/31 01:13:03 INFO mapred.JobClient:  map 0% reduce 0%
> >> 12/08/31 01:13:07 INFO mapred.JobClient:  map 100% reduce 0%
> >> 12/08/31 01:23:14 INFO mapred.JobClient: Task Id : attempt_201208310112_0001_r_000000_0, Status : FAILED
> >> Task attempt_201208310112_0001_r_000000_0 failed to report status for 600 seconds. Killing!
> >>
> >> logs for the reducer :
> >>
> >> Task Logs: 'attempt_201208310112_0001_r_000000_0'
> >>
> >> stdout logs
> >>
> >> stderr logs
> >> 2012-08-31 01:13:06.316 java[46834:1203] Unable to load realm info from SCDynamicStore
> >>
> >> syslog logs
> >> 2012-08-31 01:13:06,421 INFO org.apache.hadoop.security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
> >> 2012-08-31 01:13:06,674 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> >> 2012-08-31 01:13:06,848 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=SHUFFLE, sessionId=
> >> 2012-08-31 01:13:06,945 INFO org.apache.hadoop.mapred.Task:  Using ResourceCalculatorPlugin : null
> >> 2012-08-31 01:13:06,957 INFO org.apache.hadoop.mapred.ReduceTask: ShuffleRamManager: MemoryLimit=144965632, MaxSingleShuffleLimit=36241408
> >> 2012-08-31 01:13:06,962 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Thread started: Thread for merging on-disk files
> >> 2012-08-31 01:13:06,963 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Thread waiting: Thread for merging on-disk files
> >> 2012-08-31 01:13:06,963 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Thread started: Thread for merging in memory files
> >> 2012-08-31 01:13:06,964 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Thread started: Thread for polling Map Completion Events
> >> 2012-08-31 01:13:06,964 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Need another 1 map output(s) where 0 is already in progress
> >> 2012-08-31 01:13:06,965 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Scheduled 0 outputs (0 slow hosts and0 dup hosts)
> >> 2012-08-31 01:13:11,966 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Scheduled 1 outputs (0 slow hosts and0 dup hosts)
> >> 2012-08-31 01:14:07,996 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Need another 1 map output(s) where 1 is already in progress
> >> 2012-08-31 01:14:07,996 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Scheduled 0 outputs (0 slow hosts and0 dup hosts)
> >> 2012-08-31 01:15:08,033 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Need another 1 map output(s) where 1 is already in progress
> >> 2012-08-31 01:15:08,033 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Scheduled 0 outputs (0 slow hosts and0 dup hosts)
> >> 2012-08-31 01:16:08,069 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Need another 1 map output(s) where 1 is already in progress
> >> 2012-08-31 01:16:08,070 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Scheduled 0 outputs (0 slow hosts and0 dup hosts)
> >> 2012-08-31 01:17:08,106 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Need another 1 map output(s) where 1 is already in progress
> >> 2012-08-31 01:17:08,107 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Scheduled 0 outputs (0 slow hosts and0 dup hosts)
> >> 2012-08-31 01:18:08,147 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Need another 1 map output(s) where 1 is already in progress
> >> 2012-08-31 01:18:08,147 INFO org.apache.hadoop.mapred.ReduceTask: attempt_201208310112_0001_r_000000_0 Scheduled 0 outputs (0 slow hosts and0 dup hosts)
> >>
> >> I can successfully run the wordcount example on my old macbook (os x 10.6) previously.
> >>
> >> Any suggestions ?
> >>
> >> Best Regards,
> >> Stone
> >>
> >> On Mon, Aug 13, 2012 at 12:21 PM, Subho Banerjee
> >> wrote:
> >>>
> >>> Hello,
> >>>
> >>> I am running hadoop v1.0.3 in Mac OS X 10.8 with Java_1.6.0_33-b03-424
> >>>
> >>> When running hadoop on pseudo-distributed mode, the map seems to work,
> >>> but it cannot compute the reduce.
> >>>
> >>> 12/08/13 08:58:12 INFO mapred.JobClient: Running job: job_201208130857_0001
> >>> 12/08/13 08:58:13 INFO mapred.JobClient: map 0% reduce 0%
> >>> 12/08/13 08:58:27 INFO mapred.JobClient: map 20% reduce 0%
> >>> 12/08/13 08:58:33 INFO mapred.JobClient: map 30% reduce 0%
> >>> 12/08/13 08:58:36 INFO mapred.JobClient: map 40% reduce 0%
> >>> 12/08/13 08:58:39 INFO mapred.JobClient: map 50% reduce 0%
> >>> 12/08/13 08:58:42 INFO mapred.JobClient: map 60% reduce 0%
> >>> 12/08/13 08:58:45 INFO mapred.JobClient: map 70% reduce 0%
> >>> 12/08/13 08:58:48 INFO mapred.JobClient: map 80% reduce 0%
> >>> 12/08/13 08:58:51 INFO mapred.JobClient: map 90% reduce 0%
> >>> 12/08/13 08:58:54 INFO mapred.JobClient: map 100% reduce 0%
> >>> 12/08/13 08:59:14 INFO mapred.JobClient: Task Id : attempt_201208130857_0001_m_000000_0, Status : FAILED
> >>> Too many fetch-failures
> >>> 12/08/13 08:59:14 WARN mapred.JobClient: Error reading task outputServer returned HTTP response code: 403 for URL:
> >>> http://10.1.66.17:50060/tasklog?plaintext=true&attemptid=attempt_201208130857_0001_m_000000_0&filter=stdout
> >>> 12/08/13 08:59:14 WARN mapred.JobClient: Error reading task outputServer returned HTTP response code: 403 for URL:
> >>> http://10.1.66.17:50060/tasklog?plaintext=true&attemptid=attempt_201208130857_0001_m_000000_0&filter=stderr
> >>> 12/08/13 08:59:18 INFO mapred.JobClient: map 89% reduce 0%
> >>> 12/08/13 08:59:21 INFO mapred.JobClient: map 100% reduce 0%
> >>> 12/08/13 09:00:14 INFO mapred.JobClient: Task Id : attempt_201208130857_0001_m_000001_0, Status : FAILED
> >>> Too many fetch-failures
> >>>
> >>> Here is what I get when I try to see the tasklog using the links given in
> >>> the output
> >>>
> >>> http://10.1.66.17:50060/tasklog?plaintext=true&attemptid=attempt_201208130857_0001_m_000000_0&filter=stderr
> >>> --->
> >>> 2012-08-13 08:58:39.189 java[74092:1203] Unable to load realm info from
SCDynamicStore
> >>>
> >>> http://10.1.66.17:50060/tasklog?plaintext=true&attemptid=attempt_201208130857_0001_m_000000_0&filter=stdout
> >>> --->
> >>>
> >>> I have changed my hadoop-env.sh according to Mathew Buckett in
> >>> https://issues.apache.org/jira/browse/HADOOP-7489
> >>>
> >>> Also this error of Unable to load realm info from SCDynamicStore does not
> >>> show up when I do 'hadoop namenode -format' or 'start-all.sh'
> >>>
> >>> I am also attaching a zipped copy of my logs
> >>>
> >>> Cheers,
> >>>
> >>> Subho.
>
> --
> Harsh J
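A thread dump like the one quoted earlier is normally captured by finding the job client JVM with jps and feeding its pid to jstack. A sketch, using canned jps output in place of a live JVM (in practice you would pipe `jps -l` directly):

```shell
# A `hadoop jar` client runs with main class org.apache.hadoop.util.RunJar.
# Canned sample of what `jps -l` prints; substitute: jps_output=$(jps -l)
jps_output='46834 org.apache.hadoop.util.RunJar
1203 sun.tools.jps.Jps'

# Pick the pid of the first RunJar process.
pid=$(printf '%s\n' "$jps_output" | awk '/RunJar/ {print $1; exit}')

# With a live JVM this would be: jstack "$pid" > "jstack-$pid.txt"
echo "jstack $pid"
```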
