Subject: RE: problem with running WordCode v0 with a distributed operation
Date: Tue, 30 Jun 2009 10:30:31 -0500
From: "Gross, Danny"

Hello,

I suggest that you check iptables on your systems. At one time, one of my nodes showed a similar error, and this was the culprit.

Good luck,

Danny

-----Original Message-----
From: C J [mailto:xine.jar@googlemail.com]
Sent: Tuesday, June 30, 2009 2:15 AM
To: general@hadoop.apache.org; C J
Subject: problem with running WordCode v0 with a distributed operation

Help help help,

I am a new user, and I have been trying to run Hadoop 0.18.3 for 2-3 weeks already.

*My settings:*
1. I am on Linux
2. Using Java 1.6.0
3. I have unpacked the hadoop-0.18.3 folder directly on the desktop

*Working steps:*
1. I have succeeded in running it locally
2. I went through the quick start tutorial and managed to operate Hadoop in the standalone and pseudo-distributed modes
3. I started going through the map/reduce tutorial and managed to run WordCount v1.0 in standalone operation.

*Current status:*
1. I would like to run the WordCount v1.0 example in distributed operation
2.
I went through the steps in the cluster setup tutorial and did the following:

- On a distant server I am running 4 virtual machines:
    134.130.223.58:1
    134.130.223.72:1
    134.130.223.85:1
    134.130.223.92:1

- My own machine has the following IP address: 134.130.222.54

- The hadoop-env.sh file is the same on all five machines:

    export HADOOP_HEAPSIZE=500
    export JAVA_HOME="/usr/java/jdk1.6.0_14"
    export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
    export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
    export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
    export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
    export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
    export HADOOP_HOME="/root/Desktop/hadoop-0.18.3"
    export HADOOP_VERSION="0.18.3"
    export HADOOP_LOG_DIR=${HADOOP_HOME}/logs

- The hadoop-site.xml is the same on all five machines:

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>134.130.223.85:9000</value>
      </property>
      <property>
        <name>mapred.job.tracker</name>
        <value>134.130.223.58:1</value>
      </property>
      <property>
        <name>dfs.name.dir</name>
        <value>/root/Desktop/hadoop-0.18.3/logstina</value>
      </property>
      <property>
        <name>dfs.data.dir</name>
        <value>/root/Desktop/hadoop-0.18.3/blockstina</value>
      </property>
      <property>
        <name>mapred.system.dir</name>
        <value>systemtina</value>
      </property>
      <property>
        <name>mapred.local.dir</name>
        <value>/root/Desktop/hadoop-0.18.3/tempMapReducetina</value>
      </property>
    </configuration>

- The slaves file is the same on all five machines and contains:
    134.130.223.72
    134.130.222.54
    134.130.223.92

- From the NameNode machine 134.130.223.85 I have formatted a new namenode, started HDFS (bin/start-dfs.sh) and started Map-Reduce (bin/start-mapred.sh)

*Problem:*

Since the jar file of WordCount had already been created (by the local operation) in the folder usr/tina, I tried running the application directly, as in the local operation, by typing:

    $ bin/hadoop jar usr/tina/wordcount.jar org.myorg.WordCount usr/tina/wordcount/input usr/tina/wordcount/output

Then I got the following error:

09/06/29 19:38:05 WARN fs.FileSystem: "134.130.223.85:9000" is a deprecated filesystem name. Use "hdfs://134.130.223.85:9000/" instead.
09/06/29 19:38:06 INFO ipc.Client: Retrying connect to server: /134.130.223.85:9000. Already tried 0 time(s).
09/06/29 19:38:07 INFO ipc.Client: Retrying connect to server: /134.130.223.85:9000. Already tried 1 time(s).
09/06/29 19:38:08 INFO ipc.Client: Retrying connect to server: /134.130.223.85:9000. Already tried 2 time(s).
09/06/29 19:38:09 INFO ipc.Client: Retrying connect to server: /134.130.223.85:9000. Already tried 3 time(s).
09/06/29 19:38:10 INFO ipc.Client: Retrying connect to server: /134.130.223.85:9000. Already tried 4 time(s).
09/06/29 19:38:11 INFO ipc.Client: Retrying connect to server: /134.130.223.85:9000. Already tried 5 time(s).
09/06/29 19:38:12 INFO ipc.Client: Retrying connect to server: /134.130.223.85:9000. Already tried 6 time(s).
09/06/29 19:38:13 INFO ipc.Client: Retrying connect to server: /134.130.223.85:9000. Already tried 7 time(s).
09/06/29 19:38:14 INFO ipc.Client: Retrying connect to server: /134.130.223.85:9000. Already tried 8 time(s).
09/06/29 19:38:15 INFO ipc.Client: Retrying connect to server: /134.130.223.85:9000. Already tried 9 time(s).
java.lang.RuntimeException: java.net.ConnectException: Call to /134.130.223.85:9000 failed on connection exception: java.net.ConnectException: Connection refused
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:358)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:377)
    at org.myorg.WordCount.main(WordCount.java:53)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
    at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
Caused by: java.net.ConnectException: Call to /134.130.223.85:9000 failed on connection exception: java.net.ConnectException: Connection refused
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:743)
    at org.apache.hadoop.ipc.Client.call(Client.java:719)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
    at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:348)
    at org.apache.hadoop.dfs.DFSClient.createRPCNamenode(DFSClient.java:103)
    at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:172)
    at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:67)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1339)
    at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:56)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1351)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:213)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:118)
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:354)
    ... 11 more
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
    at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:301)
    at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:178)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:820)
    at org.apache.hadoop.ipc.Client.call(Client.java:705)
    ... 23 more

*Questions*
1. Can someone help me in solving/debugging this problem?

P.S.: I have tried to stop HDFS with bin/stop-dfs.sh before starting new ones.

Thank you,
C.J
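
The "Connection refused" in the trace, together with the iptables suggestion at the top of this thread, can be narrowed down by probing the NameNode port directly from the client machine. What follows is only a minimal sketch in Python and is not from either poster; the helper name `probe_port` is hypothetical, and the host and port are taken from the fs.default.name value quoted in the message. Roughly speaking, "refused" tends to mean that nothing is listening on that address and port or that a firewall rule actively rejects the connection, while "timeout" more often points to packets being silently dropped.

```python
import socket


def probe_port(host, port, timeout=3.0):
    """Try one TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <details>".
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except socket.timeout:
        # No answer within the timeout (often dropped packets).
        return "timeout"
    except ConnectionRefusedError:
        # The peer (or a rejecting firewall rule) refused the connection.
        return "refused"
    except OSError as exc:
        # Anything else, for example "no route to host".
        return "error: %s" % exc
    finally:
        sock.close()
```

Calling `probe_port("134.130.223.85", 9000)` from the client, and again from the NameNode machine itself, may help separate a firewall problem from a NameNode that never started listening; if the port is refused even locally, the NameNode log under HADOOP_LOG_DIR is probably the next place to look.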