hadoop-common-user mailing list archives

From Dhaya007 <mgdha...@gmail.com>
Subject Re: Not able to start Data Node
Date Wed, 02 Jan 2008 12:49:08 GMT



Arun C Murthy wrote:
> 
> What version of Hadoop are you running?
> Dhaya007: hadoop-0.15.1
> 
> http://wiki.apache.org/lucene-hadoop/Help
> 
> Dhaya007 wrote:
>  > ..datanode-slave.log
>> 2007-12-19 19:30:55,579 WARN org.apache.hadoop.dfs.DataNode: Invalid
>> directory in dfs.data.dir: directory is not writable:
>> /tmp/hadoop-hdpusr/dfs/data
>> 2007-12-19 19:30:55,579 ERROR org.apache.hadoop.dfs.DataNode: All
>> directories in dfs.data.dir are invalid.
> 
> Did you check that directory?
> Dhaya007: Yes, I have checked the folder; there is no file saved in it.
> 
> DataNode is complaining that it doesn't have any 'valid' directories to 
> store data in.
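
A minimal sketch of the kind of check implied here, assuming it is run on the affected node as the same user that starts the DataNode. The function name `check_data_dir` is hypothetical and not part of Hadoop; the path would be whatever the log reports (in this thread, /tmp/hadoop-hdpusr/dfs/data).

```python
import os
import sys


def check_data_dir(path):
    """Return a list of problems found with a candidate storage directory.

    Hypothetical helper (not a Hadoop API): it only checks that the path
    is an existing directory and that the current user may write into it.
    """
    problems = []
    if not os.path.isdir(path):
        problems.append("missing or not a directory")
    elif not os.access(path, os.W_OK | os.X_OK):
        problems.append("not writable by the current user")
    return problems


if __name__ == "__main__":
    # Pass the directories to check on the command line, for example the
    # path reported in the DataNode log.
    for directory in sys.argv[1:]:
        issues = check_data_dir(directory)
        print(directory, "->", "ok" if not issues else ", ".join(issues))
```

An empty result only means the directory exists and is writable by the user running the script; it does not confirm which directory the DataNode is actually configured to use.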
> 
>> Tasktracker_slav.log
>> 2008-01-02 15:10:34,419 ERROR org.apache.hadoop.mapred.TaskTracker: Can
>> not
>> start task tracker because java.net.UnknownHostException: unknown host:
>> localhost
>> 	at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:136)
>> 	at org.apache.hadoop.ipc.Client.getConnection(Client.java:532)
>> 	at org.apache.hadoop.ipc.Client.call(Client.java:471)
>> 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>> 	at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion(Unknown Source)
>> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
>> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:293)
>> 	at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:246)
>> 	at org.apache.hadoop.mapred.TaskTracker.initialize(TaskTracker.java:427)
>> 	at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:717)
>> 	at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1880)
>> 
> 
> That probably means that the TaskTracker's hadoop-site.xml says that 
> 'localhost' is the JobTracker which isn't true...
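
One way to test that suspicion is to read the hadoop-site.xml that each node actually has on disk and flag addresses that point at localhost. The sketch below is an illustration only: the helper names are hypothetical, and the location of the file depends on the installation. It may also be worth noting that the logs in this thread mention /tmp/hadoop-hdpusr and localhost:54310, while the posted file sets /home/hdusr/hadoop-${user.name} and master:54310, which could mean the daemons are loading a different file than the one posted.

```python
import xml.etree.ElementTree as ET


def properties_from_root(root):
    """Collect <property> name/value pairs from a <configuration> element."""
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def load_properties(path):
    """Parse a hadoop-site.xml file from disk (path is installation-specific)."""
    return properties_from_root(ET.parse(path).getroot())


def localhost_addresses(props, keys=("fs.default.name", "mapred.job.tracker")):
    """Return the address properties that are unset in this file or name localhost."""
    suspicious = {}
    for key in keys:
        value = props.get(key)
        if value is None:
            suspicious[key] = "<not set>"
        elif "localhost" in value or "127.0.0.1" in value:
            suspicious[key] = value
    return suspicious
```

Running `localhost_addresses(load_properties(path))` on every node and comparing the results would show whether all nodes really carry the same addresses.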
> 
> Dhaya007: hadoop-site.xml is as follows:
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> 
> <!-- Put site-specific property overrides in this file. -->
> 
> <configuration>
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/home/hdusr/hadoop-${user.name}</value>
>   <description>A base for other temporary directories.</description>
> </property>
>  
> <property>
>   <name>fs.default.name</name>
>   <value>hdfs://master:54310</value>
>   <description>The name of the default file system.  A URI whose
>   scheme and authority determine the FileSystem implementation.  The
>   uri's scheme determines the config property (fs.SCHEME.impl) naming
>   the FileSystem implementation class.  The uri's authority is used to
>   determine the host, port, etc. for a filesystem.</description>
> </property>
>  
> <property>
>   <name>mapred.job.tracker</name>
>   <value>master:54311</value>
>   <description>The host and port that the MapReduce job tracker runs
>   at.  If "local", then jobs are run in-process as a single map
>   and reduce task.
>   </description>
> </property>
>  
> <property>
>   <name>dfs.replication</name>
>   <value>2</value>
>   <description>Default block replication.
>   The actual number of replications can be specified when the file is
> created.
>   The default is used if replication is not specified in create time.
>   </description>
> </property>
> 
> <property>
>   <name>mapred.map.tasks</name>
>   <value>20</value>
>   <description>As a rule of thumb, use 10x the number of slaves (i.e.,
> number of tasktrackers).
>   </description>
> </property>
> 
> <property>
>   <name>mapred.reduce.tasks</name>
>   <value>4</value>
>   <description>As a rule of thumb, use 2x the number of slave processors
> (i.e., number of tasktrackers).
>   </description>
> </property>
> </configuration>
> 
>  > namenode-master.log
>  > 2008-01-02 14:44:02,636 INFO org.apache.hadoop.dfs.Storage: Storage
>  > directory /tmp/hadoop-hdpusr/dfs/name does not exist.
>  > 2008-01-02 14:44:02,638 INFO org.apache.hadoop.ipc.Server: Stopping server
>  > on 54310
>  > 2008-01-02 14:44:02,653 ERROR org.apache.hadoop.dfs.NameNode:
>  > org.apache.hadoop.dfs.InconsistentFSStateException: Directory
>  > /tmp/hadoop-hdpusr/dfs/name is in an inconsistent state: storage directory
>  > does not exist or is not accessible.
> 
> That means that /tmp/hadoop-hdpusr/dfs/name doesn't exist or isn't
> accessible.
> 
> Dhaya007: I have looked for the name folder, but I can't find any such folder in the
> specified dir.
> -*-*-
> 
> Overall, this looks like an acute case of wrong-configuration-itis.
> Dhaya007: Please provide a correct example site configuration for a multi-node cluster
> other than
> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
> because I have already followed that one.
> 
> Have you got the same hadoop-site.xml on all your nodes?
> Dhaya007:Yes
> 
> More info here: 
> http://lucene.apache.org/hadoop/docs/r0.15.1/cluster_setup.html
> Dhaya007: I followed the same page you mentioned, but it did not solve the problem.
> 
> Arun
> 
> 
>> 2008-01-02 15:10:34,420 INFO org.apache.hadoop.mapred.TaskTracker:
>> SHUTDOWN_MSG: 
>> /************************************************************
>> SHUTDOWN_MSG: Shutting down TaskTracker at slave/172.16.0.58
>> ************************************************************/
>> 
>> 
>> And all the ports are running.
>> Sometimes it asks for a password and sometimes it doesn't while starting the dfs.
>> 
>> Master logs
>> 2008-01-02 14:44:02,677 INFO org.apache.hadoop.dfs.NameNode:
>> SHUTDOWN_MSG: 
>> /************************************************************
>> SHUTDOWN_MSG: Shutting down NameNode at master/172.16.0.25
>> ************************************************************/
>> 
>> Datanode-master.log
>> 2008-01-02 16:26:32,380 INFO org.apache.hadoop.ipc.RPC: Server at
>> localhost/127.0.0.1:54310 not available yet, Zzzzz...
>> 2008-01-02 16:26:33,390 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
>> 2008-01-02 16:26:34,400 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
>> 2008-01-02 16:26:35,410 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
>> 2008-01-02 16:26:36,420 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
>> ***********************************************
>> Jobtracker_master.log
>> 2008-01-02 16:25:41,040 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 10 time(s).
>> 2008-01-02 16:25:42,050 INFO org.apache.hadoop.mapred.JobTracker: problem
>> cleaning system directory: /tmp/hadoop-hdpusr/mapred/system
>> java.net.ConnectException: Connection refused
>> 	at java.net.PlainSocketImpl.socketConnect(Native Method)
>> 	at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>> 	at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>> 	at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>> 	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>> 	at java.net.Socket.connect(Socket.java:520)
>> 	at
>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:152)
>> 	at org.apache.hadoop.ipc.Client.getConnection(Client.java:542)
>> 	at org.apache.hadoop.ipc.Client.call(Client.java:471)
>> 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>> 	at org.apache.hadoop.dfs.$Proxy0.getProtocolVersion(Unknown Source)
>> 	at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:269)
>> 	at org.apache.hadoop.dfs.DFSClient.createNamenode(DFSClient.java:147)
>> 	at org.apache.hadoop.dfs.DFSClient.<init>(DFSClient.java:161)
>> 	at
>> org.apache.hadoop.dfs.DistributedFileSystem.initialize(DistributedFileSystem.java:65)
>> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:159)
>> 	at org.apache.hadoop.fs.FileSystem.getNamed(FileSystem.java:118)
>> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:90)
>> 	at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:683)
>> 	at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:120)
>> 	at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:2052)
>> 2008-01-02 16:25:42,931 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 5 on 54311, call getFilesystemName() from 127.0.0.1:49283: error:
>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>> object
>> not available yet
>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>> object
>> not available yet
>> 	at
>> org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
>> 	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>> 	at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> 	at java.lang.reflect.Method.invoke(Method.java:585)
>> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>> 2008-01-02 16:25:47,942 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 6 on 54311, call getFilesystemName() from 127.0.0.1:49293: error:
>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>> object
>> not available yet
>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>> object
>> not available yet
>> 	at
>> org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
>> 	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>> 	at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> 	at java.lang.reflect.Method.invoke(Method.java:585)
>> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>> 2008-01-02 16:25:52,061 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 1 time(s).
>> 2008-01-02 16:25:52,951 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 7 on 54311, call getFilesystemName() from 127.0.0.1:49304: error:
>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>> object
>> not available yet
>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>> object
>> not available yet
>> 	at
>> org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
>> 	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>> 	at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> 	at java.lang.reflect.Method.invoke(Method.java:585)
>> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>> 2008-01-02 16:25:53,070 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 2 time(s).
>> 2008-01-02 16:25:54,080 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 3 time(s).
>> 2008-01-02 16:25:55,090 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 4 time(s).
>> 2008-01-02 16:25:56,100 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54310. Already tried 5 time(s).
>> 2008-01-02 16:25:56,281 INFO org.apache.hadoop.mapred.JobTracker:
>> SHUTDOWN_MSG: 
>> /************************************************************
>> SHUTDOWN_MSG: Shutting down JobTracker at master/172.16.0.25
>> ************************************************************/
>> 
>> Tasktracker_master.log
>> 2008-01-02 16:26:14,080 INFO org.apache.hadoop.ipc.Client: Retrying
>> connect
>> to server: localhost/127.0.0.1:54311. Already tried 2 time(s).
>> 2008-01-02 16:28:34,510 INFO org.apache.hadoop.mapred.TaskTracker:
>> STARTUP_MSG: 
>> /************************************************************
>> STARTUP_MSG: Starting TaskTracker
>> STARTUP_MSG:   host = master/172.16.0.25
>> STARTUP_MSG:   args = []
>> ************************************************************/
>> 2008-01-02 16:28:34,739 INFO org.mortbay.util.Credential: Checking
>> Resource
>> aliases
>> 2008-01-02 16:28:34,827 INFO org.mortbay.http.HttpServer: Version
>> Jetty/5.1.4
>> 2008-01-02 16:28:35,281 INFO org.mortbay.util.Container: Started
>> org.mortbay.jetty.servlet.WebApplicationHandler@89cc5e
>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
>> WebApplicationContext[/,/]
>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
>> HttpContext[/logs,/logs]
>> 2008-01-02 16:28:35,332 INFO org.mortbay.util.Container: Started
>> HttpContext[/static,/static]
>> 2008-01-02 16:28:35,336 INFO org.mortbay.http.SocketListener: Started
>> SocketListener on 0.0.0.0:50060
>> 2008-01-02 16:28:35,336 INFO org.mortbay.util.Container: Started
>> org.mortbay.jetty.Server@1431340
>> 2008-01-02 16:28:35,383 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>> Initializing JVM Metrics with processName=TaskTracker, sessionId=
>> 2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker:
>> TaskTracker up at: /127.0.0.1:49599
>> 2008-01-02 16:28:35,402 INFO org.apache.hadoop.mapred.TaskTracker:
>> Starting
>> tracker tracker_master:/127.0.0.1:49599
>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
>> listener on 49599: starting
>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 0 on 49599: starting
>> 2008-01-02 16:28:35,406 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 1 on 49599: starting
>> 2008-01-02 16:28:35,490 INFO org.apache.hadoop.mapred.TaskTracker:
>> Starting
>> thread: Map-events fetcher for all reduce tasks on
>> tracker_master:/127.0.0.1:49599
>> 2008-01-02 16:28:35,500 INFO org.apache.hadoop.mapred.TaskTracker: Lost
>> connection to JobTracker [localhost/127.0.0.1:54311].  Retrying...
>> org.apache.hadoop.ipc.RemoteException:
>> org.apache.hadoop.mapred.JobTracker$IllegalStateException: FileSystem
>> object
>> not available yet
>> 	at
>> org.apache.hadoop.mapred.JobTracker.getFilesystemName(JobTracker.java:1475)
>> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> 	at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> 	at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> 	at java.lang.reflect.Method.invoke(Method.java:585)
>> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>> 
>> 	at org.apache.hadoop.ipc.Client.call(Client.java:482)
>> 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>> 	at org.apache.hadoop.mapred.$Proxy0.getFilesystemName(Unknown Source)
>> 	at
>> org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:773)
>> 	at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1179)
>> 	at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:1880)
>> *******************************************
>> 
>> Please help me to resolve the same.
>> 
>> 
>> Khalil Honsali wrote:
>> 
>>>Hi,
>>>
>>>I think you need to post more information, for example an excerpt of the
>>>failing datanode log. Also, please clarify the issue of connectivity:
>>>- are you able to ssh passwordless (from master to slave, slave to master,
>>>slave to slave, master to master)? You shouldn't be entering a password
>>>every time...
>>>- are you able to telnet (not necessary but preferred)?
>>>- have you verified the ports as RUNNING using the netstat command?
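
The port part of the checklist above can be approximated with a short script when telnet is not installed. This is a sketch under assumptions: `can_connect` is a hypothetical helper, and the host names and ports in the commented usage are the ones from the hadoop-site.xml posted in this thread. It only tests that a TCP connection can be opened, not that the Hadoop daemon behind the port is healthy.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    Resolution failures, refused connections and timeouts are all
    reported as False (they are all subclasses of OSError).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage, to be run from each node in turn:
#   for host, port in [("master", 54310), ("master", 54311)]:
#       print(host, port, can_connect(host, port))
```

A False result from the slave but True from the master would point toward name resolution or firewalling rather than the daemon itself.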
>>>
>>>besides, the tasktracker starts ok but not the datanode?
>>>
>>>K. Honsali
>>>
>>>On 02/01/2008, Dhaya007 <mgdhayal@gmail.com> wrote:
>>>
>>>>
>>>>I am new to Hadoop; if anything is wrong please correct me.
>>>>I have configured a single/multi-node cluster using the following link:
>>>>
>>>>http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29
>>>>.
>>>>I have followed the link but I am not able to start Hadoop in a multi-node
>>>>environment.
>>>>The problems I am facing are as follows:
>>>>1. I have configured the master and slave nodes with passphrase-less ssh. If I
>>>>try to run start-dfs.sh, it prompts for the password for the master and slave
>>>>machines. (I have copied the .ssh/id_rsa.pub key of the master into the slave's
>>>>authorized_keys file.)
>>>>
>>>>2. After giving the password, datanode, namenode, jobtracker and tasktracker
>>>>started successfully on the master, but datanode is started on the slave.
>>>>
>>>>
>>>>3. Sometimes step 2 works and sometimes it says permission denied.
>>>>
>>>>4. I have checked the datanode log file on the slave; it says "incompatible
>>>>node". I then formatted the slave and the master and started the dfs with
>>>>start-dfs.sh, but I am still getting the error.
>>>>
>>>>
>>>>The host entries in /etc/hosts on both master and slave are:
>>>>master
>>>>slave
>>>>conf/masters
>>>>master
>>>>conf/slaves
>>>>master
>>>>slave
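
Since the TaskTracker log from the slave elsewhere in this thread reports "unknown host: localhost", it may be worth confirming what each name resolves to on each node. The following is a rough sketch, not a definitive diagnosis: `resolve` and `report` are hypothetical helpers, the names master, slave and localhost are the ones used in this thread, and the JVM's resolver can in principle behave differently from Python's.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address the local resolver gives for hostname, or None."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        # socket.gaierror (name not known) is a subclass of OSError.
        return None


def report(hostnames):
    """Map each hostname to its resolved address (None if it does not resolve)."""
    return {name: resolve(name) for name in hostnames}


# Example usage, to be run on both machines:
#   print(report(["localhost", "master", "slave"]))
```

If "master" resolves to 127.0.0.1 on the master itself, or "localhost" resolves to nothing on the slave, the /etc/hosts entries would be the first thing to review.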
>>>>
>>>>The hadoop-site.xml for both master and slave:
>>>><?xml version="1.0"?>
>>>><?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>>
>>>><!-- Put site-specific property overrides in this file. -->
>>>>
>>>><configuration>
>>>><property>
>>>>  <name>hadoop.tmp.dir</name>
>>>>  <value>/home/hdusr/hadoop-${user.name}</value>
>>>>  <description>A base for other temporary directories.</description>
>>>></property>
>>>>
>>>><property>
>>>>  <name>fs.default.name</name>
>>>>  <value>hdfs://master:54310</value>
>>>>  <description>The name of the default file system.  A URI whose
>>>>  scheme and authority determine the FileSystem implementation.  The
>>>>  uri's scheme determines the config property (fs.SCHEME.impl) naming
>>>>  the FileSystem implementation class.  The uri's authority is used to
>>>>  determine the host, port, etc. for a filesystem.</description>
>>>></property>
>>>>
>>>><property>
>>>>  <name>mapred.job.tracker</name>
>>>>  <value>master:54311</value>
>>>>  <description>The host and port that the MapReduce job tracker runs
>>>>  at.  If "local", then jobs are run in-process as a single map
>>>>  and reduce task.
>>>>  </description>
>>>></property>
>>>>
>>>><property>
>>>>  <name>dfs.replication</name>
>>>>  <value>2</value>
>>>>  <description>Default block replication.
>>>>  The actual number of replications can be specified when the file is
>>>>created.
>>>>  The default is used if replication is not specified in create time.
>>>>  </description>
>>>></property>
>>>>
>>>><property>
>>>>  <name>mapred.map.tasks</name>
>>>>  <value>20</value>
>>>>  <description>As a rule of thumb, use 10x the number of slaves (i.e.,
>>>>number of tasktrackers).
>>>>  </description>
>>>></property>
>>>>
>>>><property>
>>>>  <name>mapred.reduce.tasks</name>
>>>>  <value>4</value>
>>>>  <description>As a rule of thumb, use 2x the number of slave processors
>>>>(i.e., number of tasktrackers).
>>>>  </description>
>>>></property>
>>>></configuration>
>>>>
>>>>Please help me to resolve this, or else point me to another tutorial
>>>>for multi-node cluster setup. I am eagerly waiting for the tutorials.
>>>>
>>>>
>>>>Thanks
>>>>
>>>>--
>>>>View this message in context:
>>>>http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14573889.html
>>>>Sent from the Hadoop Users mailing list archive at Nabble.com.
>>>>
>>>>
>>>
>>>
>> 
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Not-able-to-start-Data-Node-tp14573889p14577669.html
Sent from the Hadoop Users mailing list archive at Nabble.com.

