hadoop-common-user mailing list archives

From Nick Klosterman <nklos...@ecn.purdue.edu>
Subject Re: Ubuntu Single Node Tutorial failure. No live or dead nodes.
Date Wed, 10 Feb 2010 22:19:36 GMT
@E.Sammer, no, I don't *think* it is part of another cluster. The tutorial is for a single node cluster, just as an initial setup to see if you can get things up and running. I have reformatted the namenode several times in my effort to get Hadoop to work.

@abishek
I tried the workaround you pointed me to, to no avail. I tried to modify those directions, since in the single node setup I don't have a dfs.data.dir entry in hdfs-site.xml.
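
For reference, here is roughly what I tried adding (a sketch only; the value is my guess based on where the tutorial keeps the datastore, since the single node config never specifies dfs.data.dir):

<!-- hypothetical addition to conf/hdfs-site.xml; the path is an assumption -->
<property>
  <name>dfs.data.dir</name>
  <value>/home/hadoop/hadoop-datastore/hadoop-hadoop/dfs/data</value>
</property>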

My attempts at further debugging:
----------------------------------------- ATTEMPT AT FIXING THE DATANODE PROBLEM

hadoop@potr134pc26:/usr/local/hadoop/bin$ rm -r /usr/local/hadoop-datastore/
----NOW THERE IS NO HADOOP-DATASTORE FOLDER LOCALLY
hadoop@potr134pc26:/usr/local/hadoop/bin$ ./hadoop namenode -format
10/02/10 16:33:50 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = potr134pc26/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.1
STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/common/tags/release-0.20.1-rc1 -r 810220; compiled by 'oom' on Tue Sep  1 20:55:56 UTC 2009
************************************************************/
Re-format filesystem in /home/hadoop/hadoop-datastore/hadoop-hadoop/dfs/name ? (Y or N) Y
10/02/10 16:33:54 INFO namenode.FSNamesystem: fsOwner=hadoop,hadoop
10/02/10 16:33:54 INFO namenode.FSNamesystem: supergroup=supergroup
10/02/10 16:33:54 INFO namenode.FSNamesystem: isPermissionEnabled=true
10/02/10 16:33:54 INFO common.Storage: Image file of size 96 saved in 0 seconds.
10/02/10 16:33:54 INFO common.Storage: Storage directory /home/hadoop/hadoop-datastore/hadoop-hadoop/dfs/name has been successfully formatted.
10/02/10 16:33:54 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at potr134pc26/127.0.0.1
************************************************************/
hadoop@potr134pc26:/usr/local/hadoop/bin$ ./start-all.sh
starting namenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-namenode-potr134pc26.out
localhost: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-datanode-potr134pc26.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-potr134pc26.out
starting jobtracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-jobtracker-potr134pc26.out
localhost: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-tasktracker-potr134pc26.out

hadoop@potr134pc26:/usr/local/hadoop/bin$ jps
27461 Jps
27354 TaskTracker
27158 SecondaryNameNode
27250 JobTracker
26923 NameNode
hadoop@potr134pc26:/usr/local/hadoop/bin$ ./hadoop dfsadmin -report
Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: %
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 0 (0 total, 0 dead)

---- (AT THIS POINT, WHEN I CHECKED THE LOG, THE DATANODE STILL WASN'T UP AND RUNNING) ----
mkdir /usr/local/hadoop-datastore
hadoop@potr134pc26:/usr/local/hadoop/bin$ ./stop-all.sh
stopping jobtracker
localhost: stopping tasktracker
stopping namenode
localhost: no datanode to stop
localhost: stopping secondarynamenode
hadoop@potr134pc26:/usr/local/hadoop/bin$ ./start-all.sh
starting namenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-namenode-potr134pc26.out
localhost: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-datanode-potr134pc26.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-secondarynamenode-potr134pc26.out
starting jobtracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-jobtracker-potr134pc26.out
localhost: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hadoop-tasktracker-potr134pc26.out
hadoop@potr134pc26:/usr/local/hadoop/bin$ jps
28038 NameNode
28536 Jps
28154 DataNode
28365 JobTracker
28470 TaskTracker
28272 SecondaryNameNode

./hadoop dfs -copyFromLocal /home/hadoop/Desktop/*.txt txtinput
copyFromLocal: `txtinput': specified destination directory doest not exist
hadoop@potr134pc26:/usr/local/hadoop/bin$ ./hadoop dfs -mkdir txtinput
hadoop@potr134pc26:/usr/local/hadoop/bin$ ./hadoop dfs -copyFromLocal /home/hadoop/Desktop/*.txt txtinput
10/02/10 16:44:36 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/txtinput/20417.txt could only be replicated to 0 nodes, instead of 1
 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
 	at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 	at java.lang.reflect.Method.invoke(Method.java:597)
 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
 	at java.security.AccessController.doPrivileged(Native Method)
 	at javax.security.auth.Subject.doAs(Subject.java:396)
 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

 	at org.apache.hadoop.ipc.Client.call(Client.java:739)
 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
 	at $Proxy0.addBlock(Unknown Source)
 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 	at java.lang.reflect.Method.invoke(Method.java:597)
 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
 	at $Proxy0.addBlock(Unknown Source)
 	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2904)
 	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2786)
 	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
 	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)

10/02/10 16:44:36 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
10/02/10 16:44:36 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/hadoop/txtinput/20417.txt" - Aborting...
10/02/10 16:44:36 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/txtinput/7ldvc10.txt could only be replicated to 0 nodes, instead of 1
 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
 	at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 	at java.lang.reflect.Method.invoke(Method.java:597)
 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
 	at java.security.AccessController.doPrivileged(Native Method)
 	at javax.security.auth.Subject.doAs(Subject.java:396)
 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

 	at org.apache.hadoop.ipc.Client.call(Client.java:739)
 	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
 	at $Proxy0.addBlock(Unknown Source)
 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 	at java.lang.reflect.Method.invoke(Method.java:597)
 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
 	at $Proxy0.addBlock(Unknown Source)
 	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2904)
 	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2786)
 	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2076)
 	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2262)

10/02/10 16:44:36 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
10/02/10 16:44:36 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/hadoop/txtinput/7ldvc10.txt" - Aborting...
copyFromLocal: java.io.IOException: File /user/hadoop/txtinput/20417.txt could only be replicated to 0 nodes, instead of 1
 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
 	at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 	at java.lang.reflect.Method.invoke(Method.java:597)
 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
 	at java.security.AccessController.doPrivileged(Native Method)
 	at javax.security.auth.Subject.doAs(Subject.java:396)
 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

java.io.IOException: File /user/hadoop/txtinput/7ldvc10.txt could only be replicated to 0 nodes, instead of 1
 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
 	at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 	at java.lang.reflect.Method.invoke(Method.java:597)
 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
 	at java.security.AccessController.doPrivileged(Native Method)
 	at javax.security.auth.Subject.doAs(Subject.java:396)
 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)


hadoop@potr134pc26:/usr/local/hadoop/bin$ ./hadoop dfsadmin -report
Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: %
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 0 (0 total, 0 dead)
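
For what it's worth, the next thing I can check is the datanode log itself (a sketch only; the file name below is inferred from the start-all.sh output above, and the exact wording of the namespaceID error may vary by version):

# Look for the incompatible-namespaceID complaint in the datanode log
grep -i namespaceid /usr/local/hadoop/logs/hadoop-hadoop-datanode-potr134pc26.log
# Or just read the end of the log for the startup failure
tail -n 50 /usr/local/hadoop/logs/hadoop-hadoop-datanode-potr134pc26.log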

On Wed, 10 Feb 2010, E. Sammer wrote:

> On 2/10/10 3:57 PM, Nick Klosterman wrote:
>> It appears I have incompatible namespaceIDs. Any thoughts on how to
>> resolve that?
>> This is what the full datanodes log is saying:
>
> Was this data node part of another DFS cluster at some point? It looks like 
> you've reformatted the name node since the datanode connected to it. The 
> datanode will refuse to connect to a namenode with a different namespaceId 
> because the data node would have blocks (possibly with the same ids) from 
> another cluster. It's a stopgap safety mechanism. You'd have to destroy the 
> data directory on the data node to "reinitialize" it so it picks up the new 
> namespaceId from the name node, at which point it will be allowed to connect.
>
> Just to be clear, this will also kill all data that was stored on the data 
> node, so don't do this lightly.
>
> HTH.
> -- 
> Eric Sammer
> eric@lifeless.net
> http://esammer.blogspot.com
>
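
If I understand correctly, the fix would be something along these lines (a sketch only, assuming the data directory sits next to the dfs/name path shown above; as Eric warns, this destroys any blocks stored on this node):

./stop-all.sh
# Assumed datanode storage directory, based on the dfs/name path above.
# Removing it wipes this node's blocks so it picks up the new namespaceId.
rm -r /home/hadoop/hadoop-datastore/hadoop-hadoop/dfs/data
./start-all.sh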
