hadoop-common-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ravi Prakash <ravi...@ymail.com>
Subject Re: "could only be replicated to 0 nodes instead of minReplication" exception during job execution
Date Tue, 25 Jun 2013 05:59:32 GMT
I wonder if this is another instance of https://issues.apache.org/jira/browse/HDFS-1172 . I
had noticed this behavior when I run Slive with extremely small file sizes. 




________________________________
 From: Yuzhang Han <yuzhanghan1982@gmail.com>
To: user@hadoop.apache.org 
Sent: Monday, June 24, 2013 3:01 PM
Subject: "could only be replicated to 0 nodes instead of minReplication" exception during
job execution
 


Hello,


I am using YARN. I get some exceptions at my namenode and datanode. They are thrown when my
Reduce progress gets 67%. Then, reduce phase is restarted from 0% several times, but always
restarts at this point. Can someone tell me what I should do? Many thanks!


Namenode log:


2013-06-24 19:08:50,345 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.224.2.190:50010
is added to blk_654446797771285606_5062{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[10.224.2.190:50010|RBW]]} size 0
2013-06-24 19:08:50,349 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy:
Not able to place enough replicas, still in need of 1 to reach 1
For more information, please enable DEBUG log level on org.apache.commons.logging.impl.Log4JLogger
2013-06-24 19:08:50,350 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
as:ubuntu (auth:SIMPLE) cause:java.io.IOException: File /output/_temporary/1/_temporary/attempt_1372090853102_0001_r_000002_0/part-00002
could only be replicated to 0 nodes instead of minReplication (=1).  There are 2 datanode(s)
running and no node(s) are excluded in this operation.
2013-06-24 19:08:50,353 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 9000, call
org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 10.224.2.190:49375: error: java.io.IOException:
File /output/_temporary/1/_temporary/attempt_1372090853102_0001_r_000002_0/part-00002 could
only be replicated to 0 nodes instead of minReplication (=1).  There are 2 datanode(s) running
and no node(s) are excluded in this operation.
java.io.IOException: File /output/_temporary/1/_temporary/attempt_1372090853102_0001_r_000002_0/part-00002
could only be replicated to 0 nodes instead of minReplication (=1).  There are 2 datanode(s)
running and no node(s) are excluded in this operation. at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1339)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2155)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:491)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:351)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:40744)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
at
 org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1741)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1737) at java.security.AccessController.doPrivileged(Native
Method) at javax.security.auth.Subject.doAs(Subject.java:416) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1735)
2013-06-24 19:08:50,413 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.224.2.190:50010
is added to blk_8924314838535676494_5063{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1,
replicas=[ReplicaUnderConstruction[10.224.2.190:50010|RBW]]} size 0
2013-06-24 19:08:50,418 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy:
Not able to place enough replicas, still in need of 1 to reach 1
For more information, please enable DEBUG log level on org.apache.commons.logging.impl.Log4JLogger


Datanode log:

2013-06-24 19:25:54,695 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
BP-1724882733-10.10.79.145-1372090400593:blk_-2417373821601940925_6022, type=LAST_IN_PIPELINE,
downstreams=0:[] terminating
2013-06-24 19:25:54,699 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-1724882733-10.10.79.145-1372090400593:blk_3177955398059619584_6033
src: /10.35.99.108:59710 dest: /10.35.99.108:50010
2013-06-24 19:25:56,473 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for
BP-1724882733-10.10.79.145-1372090400593:blk_8751401862589207807_6026
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.read0(Native Method)
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251)
    at sun.nio.ch.IOUtil.read(IOUtil.java:224)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
    at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:159)
    at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at java.io.DataInputStream.read(DataInputStream.java:149)
    at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:171)
    at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:414)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:644)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:506)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:65)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:219)
    at java.lang.Thread.run(Thread.java:679)
2013-06-24 19:25:56,476 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder:
BP-1724882733-10.10.79.145-1372090400593:blk_8751401862589207807_6026, type=LAST_IN_PIPELINE,
downstreams=0:[]: Thread is interrupted.
Mime
View raw message