hadoop-user mailing list archives

From Yuzhang Han <yuzhanghan1...@gmail.com>
Subject "could only be replicated to 0 nodes instead of minReplication" exception during job execution
Date Mon, 24 Jun 2013 22:01:06 GMT
Hello,

I am using YARN. I get some exceptions on my namenode and datanode. They
are thrown when my reduce progress reaches 67%. The reduce phase then
restarts from 0% several times, but always fails at the same point. Can
someone tell me what I should do? Many thanks!
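
From what I can tell, this error means the namenode could not find any
datanode willing to accept the block. A generic first check (standard
HDFS commands; /output is the job's output directory from the logs
below) would be to confirm the datanodes are registered and have free
DFS space:

    # List live datanodes and their remaining DFS capacity; this error
    # typically appears when no datanode can accept a new block, e.g.
    # because disks are full or datanodes lost contact with the namenode.
    hdfs dfsadmin -report

    # Check block health under the job output directory.
    hdfs fsck /output -files -blocks

If the report showed the datanodes with near-zero DFS Remaining, that
alone would explain why even a single replica cannot be placed.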


Namenode log:

2013-06-24 19:08:50,345 INFO BlockStateChange: BLOCK* addStoredBlock:
blockMap updated: 10.224.2.190:50010 is added to
blk_654446797771285606_5062{blockUCState=UNDER_CONSTRUCTION,
primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[10.224.2.190:50010|RBW]]} size 0
2013-06-24 19:08:50,349 WARN
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy:
Not able to place enough replicas, still in need of 1 to reach 1
For more information, please enable DEBUG log level on
org.apache.commons.logging.impl.Log4JLogger
2013-06-24 19:08:50,350 ERROR
org.apache.hadoop.security.UserGroupInformation:
PriviledgedActionException as:ubuntu (auth:SIMPLE)
cause:java.io.IOException: File
/output/_temporary/1/_temporary/attempt_1372090853102_0001_r_000002_0/part-00002
could only be replicated to 0 nodes instead of minReplication (=1).
There are 2 datanode(s) running and no node(s) are excluded in this
operation.
2013-06-24 19:08:50,353 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 1 on 9000, call
org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from
10.224.2.190:49375: error: java.io.IOException: File
/output/_temporary/1/_temporary/attempt_1372090853102_0001_r_000002_0/part-00002
could only be replicated to 0 nodes instead of minReplication (=1).
There are 2 datanode(s) running and no node(s) are excluded in this
operation.
java.io.IOException: File
/output/_temporary/1/_temporary/attempt_1372090853102_0001_r_000002_0/part-00002
could only be replicated to 0 nodes instead of minReplication (=1).
There are 2 datanode(s) running and no node(s) are excluded in this
operation.
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1339)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2155)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:491)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:351)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:40744)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1741)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1737)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:416)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1735)
2013-06-24 19:08:50,413 INFO BlockStateChange: BLOCK* addStoredBlock:
blockMap updated: 10.224.2.190:50010 is added to
blk_8924314838535676494_5063{blockUCState=UNDER_CONSTRUCTION,
primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[10.224.2.190:50010|RBW]]} size 0
2013-06-24 19:08:50,418 WARN
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy:
Not able to place enough replicas, still in need of 1 to reach 1
For more information, please enable DEBUG log level on
org.apache.commons.logging.impl.Log4JLogger



Datanode log:

2013-06-24 19:25:54,695 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
BP-1724882733-10.10.79.145-1372090400593:blk_-2417373821601940925_6022,
type=LAST_IN_PIPELINE, downstreams=0:[] terminating
2013-06-24 19:25:54,699 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving
BP-1724882733-10.10.79.145-1372090400593:blk_3177955398059619584_6033 src: /10.35.99.108:59710 dest: /10.35.99.108:50010
2013-06-24 19:25:56,473 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for
BP-1724882733-10.10.79.145-1372090400593:blk_8751401862589207807_6026
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251)
    at sun.nio.ch.IOUtil.read(IOUtil.java:224)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:159)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:149)
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:171)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:414)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:644)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:506)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:65)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:219)
    at java.lang.Thread.run(Thread.java:679)
2013-06-24 19:25:56,476 INFO
org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
BP-1724882733-10.10.79.145-1372090400593:blk_8751401862589207807_6026,
type=LAST_IN_PIPELINE, downstreams=0:[]: Thread is interrupted.
