hadoop-common-user mailing list archives

From Alex Luya <alexander.l...@gmail.com>
Subject Does error "could only be replicated to 0 nodes, instead of 1 " mean no datanodes available?
Date Wed, 26 May 2010 10:53:56 GMT
Hello:
   I got this error when putting files into HDFS. It seems to be an old issue, and I
followed the solution from this link:
----------------------------------------------------------------------------------------------------------------------------
http://adityadesai.wordpress.com/2009/02/26/another-problem-with-hadoop-jobjar-could-only-be-replicated-to-0-nodes-instead-of-1io-exception/
-----------------------------------------------------------------------------------------------------------------------------

but the problem still exists, so I tried to figure it out through the source code:
-----------------------------------------------------------------------------------------------------------------------------------
 org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock()
-----------------------------------------------------------------------------------------------------------------------------------
    // choose targets for the new block to be allocated
    DatanodeDescriptor targets[] = replicator.chooseTarget(replication,
                                                           clientNode,
                                                           null,
                                                           blockSize);
    if (targets.length < this.minReplication) {
      throw new IOException("File " + src + " could only be replicated to " +
                            targets.length + " nodes, instead of " +
                            minReplication);
    }
--------------------------------------------------------------------------------------------------------------------------------------

I think "DatanodeDescriptor" represents a datanode, so "targets.length" here is the
number of datanodes chosen as targets for the new block. Clearly it is 0; in other
words, no datanode was picked. But in the web interface (localhost:50070) I can see 4
live nodes (I have only 4 nodes), and "hadoop dfsadmin -report" also shows 4 nodes.
That is strange.
	And I got this error message in the secondary namenode log:
---------------------------------------------------------------------------------------------------------------------------------
2010-05-26 16:26:39,588 INFO org.apache.hadoop.hdfs.server.common.Storage: 
Recovering storage directory /home/alex/tmp/dfs/namesecondary from failed 
checkpoint.
2010-05-26 16:26:39,593 ERROR 
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in 
doCheckpoint: 
2010-05-26 16:26:39,594 ERROR 
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: 
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:193)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
..................................
---------------------------------------------------------------------------------------------------------------------------------
and this error message in the datanode log:
---------------------------------------------------------------------------------------------------------------------------------
2010-05-26 16:07:49,039 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(192.168.1.3:50010, 
storageID=DS-1180479012-192.168.1.3-50010-1274799233678, infoPort=50075, 
ipcPort=50020):DataXceiver
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcher.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
        at sun.nio.ch.IOUtil.read(IOUtil.java:206)
.........................
---------------------------------------------------------------------------------------------------------------------------------

It seems as if some network ports aren't open, but after scanning with nmap I can
confirm that all relevant ports on the relevant nodes are open. After two days of
effort, the result is zero.
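
For concreteness, this is the kind of socket-level check I mean, in plain JDK code
with no Hadoop involved (the host and port are just examples from my cluster; 50010
is the datanode data-transfer port, and the class name PortCheck is mine):
-----------------------------------------------------------------------------------------
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical check: can this machine open a TCP connection to host:port?
public class PortCheck {
  public static void main(String[] args) throws IOException {
    String host = args.length > 0 ? args[0] : "192.168.1.3"; // a datanode
    int port = args.length > 1 ? Integer.parseInt(args[1]) : 50010;
    Socket s = new Socket();
    try {
      s.connect(new InetSocketAddress(host, port), 3000); // 3 second timeout
      System.out.println("connected to " + host + ":" + port);
    } catch (IOException e) {
      System.out.println("cannot connect to " + host + ":" + port + ": "
          + e.getMessage());
    } finally {
      s.close();
    }
  }
}
-----------------------------------------------------------------------------------------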

Can anybody help me troubleshoot this? Thank you.



      (The following is the relevant info: my cluster configuration, the contents of the
conf files, the output of "hadoop dfsadmin -report", and the Java error stack.)



-----------------------------------------------------------------------------------------
my configuration is:
-----------------------------------------------------------------------------------------
Ubuntu 10.04 64-bit + jdk1.6.0_20 + Hadoop 0.20.2
-----------------------------------------------------------------------------------------



core-site.xml
-----------------------------------------------------------------------------------------
<configuration>
<property>
    <name>fs.default.name</name>
    <value>hdfs://AlexLuya</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/home/alex/tmp</value>
</property>
</configuration>

-----------------------------------------------------------------------------------------


hdfs-site.xml
-----------------------------------------------------------------------------------------
<configuration>
        <property>
                <name>dfs.replication</name>
                <value>3</value>
        </property>
        <property>
                <name>dfs.name.dir</name>
                <value>/home/alex/hadoop/namenode</value>
        </property>
        <property>
                <name>dfs.data.dir</name>
                <value>/home/alex/hadoop/dfs</value>
        </property>
        <property>
                <name>dfs.block.size</name>
                <value>134217728</value>
        </property>
        <property>
                <name>dfs.datanode.max.xcievers</name>
                <value>2047</value>
        </property>
</configuration>

-----------------------------------------------------------------------------------------
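(As a sanity check that these files are actually being picked up, a tiny program along
these lines can print the values a client really resolves; the class name ShowConf is
mine, and it assumes core-site.xml and hdfs-site.xml are on the classpath:)
-----------------------------------------------------------------------------------------
import org.apache.hadoop.conf.Configuration;

// Hypothetical sanity check: print the config values the client resolves.
public class ShowConf {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.addResource("hdfs-site.xml"); // plain Configuration loads only core-site.xml
    System.out.println("fs.default.name = " + conf.get("fs.default.name"));
    System.out.println("dfs.replication = " + conf.get("dfs.replication"));
    System.out.println("dfs.block.size  = " + conf.get("dfs.block.size"));
  }
}
-----------------------------------------------------------------------------------------
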
masters
-----------------------------------------------------------------------------------------
192.168.1.2
-----------------------------------------------------------------------------------------
slaves
-----------------------------------------------------------------------------------------
192.168.1.3
192.168.1.4
192.168.1.5
192.168.1.6

-----------------------------------------------------------------------------------------
result of hadoop dfsadmin -report
-----------------------------------------------------------------------------------------
Configured Capacity: 6836518912 (6.37 GB)
Present Capacity: 1406951424 (1.31 GB)
DFS Remaining: 1406853120 (1.31 GB)
DFS Used: 98304 (96 KB)
DFS Used%: 0.01%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 4 (4 total, 0 dead)

Name: 192.168.1.5:50010
Decommission Status : Normal
Configured Capacity: 1709129728 (1.59 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 1345765376 (1.25 GB)
DFS Remaining: 363339776(346.51 MB)
DFS Used%: 0%
DFS Remaining%: 21.26%
Last contact: Tue May 25 20:51:09 CST 2010


Name: 192.168.1.3:50010
Decommission Status : Normal
Configured Capacity: 1709129728 (1.59 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 1373503488 (1.28 GB)
DFS Remaining: 335601664(320.05 MB)
DFS Used%: 0%
DFS Remaining%: 19.64%
Last contact: Tue May 25 20:51:10 CST 2010


Name: 192.168.1.6:50010
Decommission Status : Normal
Configured Capacity: 1709129728 (1.59 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 1346879488 (1.25 GB)
DFS Remaining: 362225664(345.45 MB)
DFS Used%: 0%
DFS Remaining%: 21.19%
Last contact: Tue May 25 20:51:08 CST 2010


Name: 192.168.1.4:50010
Decommission Status : Normal
Configured Capacity: 1709129728 (1.59 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 1363419136 (1.27 GB)
DFS Remaining: 345686016(329.67 MB)
DFS Used%: 0%
DFS Remaining%: 20.23%
Last contact: Tue May 25 20:51:08 CST 2010

-----------------------------------------------------------------------------------------
Java error stack:
-----------------------------------------------------------------------------------------
10/05/25 20:43:24 WARN hdfs.DFSClient: DataStreamer Exception: 
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File 
/user/alex/input could only be replicated to 0 nodes, instead of 1
        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
        at 
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

        at org.apache.hadoop.ipc.Client.call(Client.java:740)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy0.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy0.addBlock(Unknown Source)
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2937)
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2819)
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)

10/05/25 20:43:24 WARN hdfs.DFSClient: Error Recovery for block null bad 
datanode[0] nodes == null
10/05/25 20:43:24 WARN hdfs.DFSClient: Could not get block locations. Source 
file "/user/alex/input" - Aborting...
put: java.io.IOException: File /user/alex/input could only be replicated to 0 
nodes, instead of 1
10/05/25 20:43:24 ERROR hdfs.DFSClient: Exception closing file /user/alex/input 
: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File 
/user/alex/input could only be replicated to 0 nodes, instead of 1
        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
        at 
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

org.apache.hadoop.ipc.RemoteException: java.io.IOException: File 
/user/alex/input could only be replicated to 0 nodes, instead of 1
        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1271)
        at 
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

        at org.apache.hadoop.ipc.Client.call(Client.java:740)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy0.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy0.addBlock(Unknown Source)
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2937)
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2819)
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)

-----------------------------------------------------------------------------------------
