From: Dhaya007 <mgdhayal@gmail.com>
To: hadoop-user@lucene.apache.org
Date: Wed, 2 Jan 2008 03:13:29 -0800 (PST)
Subject: Re: Not able to start Data Node

Thanks for your reply. I am using passwordless SSH from master to slave; the logs follow.

(slave) datanode-slave.log:

2007-12-19 19:30:55,237 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = slave/172.16.0.58
STARTUP_MSG:   args = []
************************************************************/
2007-12-19 19:30:55,579 WARN org.apache.hadoop.dfs.DataNode: Invalid directory in dfs.data.dir: directory is not writable: /tmp/hadoop-hdpusr/dfs/data
2007-12-19 19:30:55,579 ERROR org.apache.hadoop.dfs.DataNode: All directories in dfs.data.dir are invalid.
2007-12-19 19:30:55,582 INFO org.apache.hadoop.dfs.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at slave/172.16.0.58
************************************************************/
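The WARN line above is the root cause on the slave: the datanode cannot write to /tmp/hadoop-hdpusr/dfs/data. A minimal fix, assuming the daemons really run as the hdpusr user and that path is the intended dfs.data.dir (a sketch, not verified on this cluster):

    # on the slave, as root: make hdpusr the owner of the data tree
    mkdir -p /tmp/hadoop-hdpusr/dfs/data
    chown -R hdpusr:hdpusr /tmp/hadoop-hdpusr
    chmod -R u+rwx /tmp/hadoop-hdpusr/dfs/data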
Tasktracker_slave.log:

2008-01-02 15:10:10,634 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG:   host = slave/172.16.0.58
STARTUP_MSG:   args = []
************************************************************/
2008-01-02 15:10:32,024 INFO org.mortbay.util.Credential: Checking Resource aliases
2008-01-02 15:10:32,368 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4
2008-01-02 15:10:33,853 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@1ce784b
2008-01-02 15:10:34,039 INFO org.mortbay.util.Container: Started WebApplicationContext[/,/]
2008-01-02 15:10:34,039 INFO org.mortbay.util.Container: Started HttpContext[/logs,/logs]
2008-01-02 15:10:34,040 INFO org.mortbay.util.Container: Started HttpContext[/static,/static]
2008-01-02 15:10:34,052 INFO org.mortbay.http.SocketListener: Started SocketListener on 0.0.0.0:50060
2008-01-02 15:10:34,052 INFO org.mortbay.util.Container: Started org.mortbay.jetty.Server@1827284
2008-01-02 15:10:34,101 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=TaskTracker, sessionId=
2008-01-02 15:10:34,235 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: /127.0.0.1:32772
2008-01-02 15:10:34,235 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_slave:/127.0.0.1:32772
2008-01-02 15:10:34,246 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 32772: starting
2008-01-02 15:10:34,247 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 32772: starting
2008-01-02 15:10:34,248 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 32772: starting
2008-01-02 15:10:34,419 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.net.UnknownHostException: unknown host: localhost
        at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:136)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:532)
        at org.apache.hadoop.ipc.Client.call(Client.java:471)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
        at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:293)
        at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:246)
        at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:427)
        at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:717)
        at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1880)
2008-01-02 15:10:34,420 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down TaskTracker at slave/172.16.0.58
************************************************************/
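The tasktracker on the slave dies with java.net.UnknownHostException: unknown host: localhost, which usually means the slave's /etc/hosts has no loopback entry. Assuming a standard layout, the file would need something like this (a sketch using the addresses from the logs; adjust to the real hosts):

    127.0.0.1    localhost localhost.localdomain
    172.16.0.25  master
    172.16.0.58  slave

After editing it, `getent hosts localhost` (or `ping -c1 localhost`) should resolve before the daemons are restarted.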
All the ports are running. Sometimes it asks for a password and sometimes it doesn't while starting the DFS.

Master logs:

namenode-master.log:

2008-01-02 14:44:01,017 INFO org.apache.hadoop.dfs.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/172.16.0.25
STARTUP_MSG:   args = []
************************************************************/
2008-01-02 14:44:02,453 INFO org.apache.hadoop.dfs.NameNode: Namenode up at: localhost.localdomain/127.0.0.1:54310
2008-01-02 14:44:02,458 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=NameNode, sessionId=null
2008-01-02 14:44:02,636 INFO org.apache.hadoop.dfs.Storage: Storage directory /tmp/hadoop-hdpusr/dfs/name does not exist.
2008-01-02 14:44:02,638 INFO org.apache.hadoop.ipc.Server: Stopping server on 54310
2008-01-02 14:44:02,653 ERROR org.apache.hadoop.dfs.NameNode: org.apache.hadoop.dfs.InconsistentFSStateException: Directory /tmp/hadoop-hdpusr/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
        at org.apache.hadoop.dfs.FSImage.recoverTransitionRead(FSImage.java:153)
        at org.apache.hadoop.dfs.FSDirectory.loadFSImage(FSDirectory.java:76)
        at org.apache.hadoop.dfs.FSNamesystem.<init>(FSNamesystem.java:221)
        at org.apache.hadoop.dfs.NameNode.init(NameNode.java:130)
        at org.apache.hadoop.dfs.NameNode.<init>(NameNode.java:168)
        at org.apache.hadoop.dfs.NameNode.createNameNode(NameNode.java:804)
        at org.apache.hadoop.dfs.NameNode.main(NameNode.java:813)
2008-01-02 14:44:02,677 INFO org.apache.hadoop.dfs.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/172.16.0.25
************************************************************/

Datanode-master.log:

2008-01-02 16:26:32,380 INFO org.apache.hadoop.ipc.RPC: Server at localhost/127.0.0.1:54310 not available yet, Zzzzz...
2008-01-02 16:26:33,390 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
2008-01-02 16:26:34,400 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
2008-01-02 16:26:35,410 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
2008-01-02 16:26:36,420 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
***********************************************
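The namenode refuses to start because its storage directory /tmp/hadoop-hdpusr/dfs/name has never been created, and the datanode retries above are just a consequence of the namenode being down. Assuming this is still a fresh cluster with no HDFS data to keep (formatting erases it), formatting the namenode once on the master should create that directory (sketch; run as the Hadoop user from the install directory):

    bin/hadoop namenode -format
    bin/start-dfs.sh

Note also that anything under /tmp can be wiped on reboot, which is one reason the tutorial moves hadoop.tmp.dir into a home directory.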
Jobtracker_master.log:

2008-01-02 16:25:41,040 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 10 time(s).
2008-01-02 16:25:42,050 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: /tmp/hadoop-hdpusr/mapred/system
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:520)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:152)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:542)
        at org.apache.hadoop.ipc.Client.call(Client.java:471)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
        at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
        at org.apache.hadoop.dfs.DFSClient.createNamenode(DFSClient.java:147)
        at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:161)
        at org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:65)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:159)
        at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
        at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:683)
        at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:120)
        at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:2052)
2008-01-02 16:25:42,931 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 54311, call getFilesystemName() from 127.0.0.1:49283: error: org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object not available yet
org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object not available yet
        at org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
2008-01-02 16:25:47,942 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 54311, call getFilesystemName() from 127.0.0.1:49293: error: org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object not available yet
org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object not available yet
        at org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
2008-01-02 16:25:52,061 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
2008-01-02 16:25:52,951 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 54311, call getFilesystemName() from 127.0.0.1:49304: error: org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object not available yet
org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object not available yet
        at org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
2008-01-02 16:25:53,070 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
2008-01-02 16:25:54,080 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
2008-01-02 16:25:55,090 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
2008-01-02 16:25:56,100 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 5 time(s).
2008-01-02 16:25:56,281 INFO org.apache.hadoop.mapred.JobTracker: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down JobTracker at master/172.16.0.25
************************************************************/
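The Connection refused on 54310 just confirms that nothing is listening where the jobtracker expects the namenode; the jobtracker and tasktracker failures here are all derived from the namenode not starting. Once the namenode comes up cleanly, a quick check before starting MapReduce might look like this (a sketch; 54310/54311 are the ports from this configuration):

    netstat -tlnp | grep -E '54310|54311'   # is the namenode / jobtracker listening?
    bin/start-dfs.sh       # on the master: DFS first
    bin/start-mapred.sh    # then MapReduce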
Tasktracker_master.log:

2008-01-02 16:26:14,080 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:54311. Already tried 2 time(s).
2008-01-02 16:28:34,510 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG:   host = master/172.16.0.25
STARTUP_MSG:   args = []
************************************************************/
2008-01-02 16:28:34,739 INFO org.mortbay.util.Credential: Checking Resource aliases
2008-01-02 16:28:34,827 INFO org.mortbay.http.HttpServer: Version Jetty/5.1.4
2008-01-02 16:28:35,281 INFO org.mortbay.util.Container: Started org.mortbay.jetty.servlet.WebApplicationHandler@89cc5e
2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started WebApplicationContext[/,/]
2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started HttpContext[/logs,/logs]
2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started HttpContext[/static,/static]
2008-01-02 16:28:35,336 INFO org.mortbay.http.SocketListener: Started SocketListener on 0.0.0.0:50060
2008-01-02 16:28:35,336 INFO org.mortbay.util.Container: Started org.mortbay.jetty.Server@1431340
2008-01-02 16:28:35,383 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=TaskTracker, sessionId=
2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: /127.0.0.1:49599
2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_master:/127.0.0.1:49599
2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 49599: starting
2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 49599: starting
2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 49599: starting
2008-01-02 16:28:35,490 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_master:/127.0.0.1:49599
2008-01-02 16:28:35,500 INFO org.apache.hadoop.mapred.TaskTracker: Lost connection to JobTracker [localhost/127.0.0.1:54311]. Retrying...
org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem object not available yet
        at org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
        at org.apache.hadoop.ipc.Client.call(Client.java:482)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
        at org.apache.hadoop.mapred.$Proxy0.getFilesystemName(Unknown Source)
        at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:773)
        at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1179)
        at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1880)
*******************************************

Please help me to resolve this.
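On the intermittent password prompts: start-dfs.sh opens an SSH connection to every host in conf/slaves, including the master itself, so the master's key must be in its own authorized_keys too, and the .ssh permissions must be strict or sshd silently ignores the key. A minimal check, assuming the same hdpusr account on each machine (a sketch under those assumptions):

    # on each host, as the hadoop user
    chmod 700 ~/.ssh
    chmod 600 ~/.ssh/authorized_keys
    # on the master, so it can also ssh to itself
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    ssh master true && ssh slave true   # both should return with no prompt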
Khalil Honsali wrote:
>
> Hi,
>
> I think you need to post more information, for example an excerpt of the
> failing datanode log. Also, please clarify the issue of connectivity:
> - are you able to ssh passwordless (from master to slave, slave to master,
> slave to slave, master to master)? You shouldn't be entering a password
> every time...
> - are you able to telnet (not necessary but preferred)?
> - have you verified the ports as RUNNING using the netstat command?
>
> Besides, the tasktracker starts OK but not the datanode?
>
> K. Honsali
>
> On 02/01/2008, Dhaya007 wrote:
>>
>> I am new to Hadoop; if anything is wrong please correct me.
>> I have configured a single/multi-node cluster using the following link:
>>
>> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
>>
>> I have followed the link but I am not able to start Hadoop in a
>> multi-node environment.
>> The problems I am facing are as follows:
>> 1. I have configured the master and slave nodes with passphrase-less SSH;
>> if I try to run start-dfs.sh it prompts for the password for the
>> master:slave machines. (I have copied the .ssh/id_rsa.pub key of the
>> master into the slave's authorized_keys file.)
>>
>> 2. After giving the password, the datanode, namenode, jobtracker and
>> tasktracker started successfully on the master, but the datanode is not
>> started on the slave.
>>
>> 3. Sometimes step 2 works and sometimes it says permission denied.
>>
>> 4. I have checked the log file on the slave for the datanode; it says
>> incompatible node, so I formatted the slave and the master and started
>> the DFS with start-dfs.sh, but I am still getting the error.
>>
>> The host entries in etc/hosts on both master and slave:
>> master
>> slave
>> conf/masters:
>> master
>> conf/slaves:
>> master
>> slave
>>
>> The hadoop-site.xml for both master/slave:
>>
>> <?xml version="1.0"?>
>> <configuration>
>>   <property>
>>     <name>hadoop.tmp.dir</name>
>>     <value>/home/hdusr/hadoop-${user.name}</value>
>>     <description>A base for other temporary directories.</description>
>>   </property>
>>   <property>
>>     <name>fs.default.name</name>
>>     <value>hdfs://master:54310</value>
>>     <description>The name of the default file system. A URI whose
>>     scheme and authority determine the FileSystem implementation. The
>>     uri's scheme determines the config property (fs.SCHEME.impl) naming
>>     the FileSystem implementation class. The uri's authority is used to
>>     determine the host, port, etc. for a filesystem.</description>
>>   </property>
>>   <property>
>>     <name>mapred.job.tracker</name>
>>     <value>master:54311</value>
>>     <description>The host and port that the MapReduce job tracker runs
>>     at. If "local", then jobs are run in-process as a single map
>>     and reduce task.</description>
>>   </property>
>>   <property>
>>     <name>dfs.replication</name>
>>     <value>2</value>
>>     <description>Default block replication. The actual number of
>>     replications can be specified when the file is created. The default
>>     is used if replication is not specified at create time.</description>
>>   </property>
>>   <property>
>>     <name>mapred.map.tasks</name>
>>     <value>20</value>
>>     <description>As a rule of thumb, use 10x the number of slaves (i.e.,
>>     number of tasktrackers).</description>
>>   </property>
>>   <property>
>>     <name>mapred.reduce.tasks</name>
>>     <value>4</value>
>>     <description>As a rule of thumb, use 2x the number of slave
>>     processors (i.e., number of tasktrackers).</description>
>>   </property>
>> </configuration>
>>
>> Please help me resolve this, or else provide any other tutorial for
>> multi-node cluster setup. I am eagerly waiting.
>>
>> Thanks
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14573889.html
>> Sent from the Hadoop Users mailing list archive at Nabble.com.
>>
>

--
View this message in context: http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14576700.html
Sent from the Hadoop Users mailing list archive at Nabble.com.