hadoop-hdfs-user mailing list archives

From Ivan Tretyakov <itretya...@griddynamics.com>
Subject Re: could only be replicated to 0 nodes instead of minReplication
Date Thu, 10 Jan 2013 13:33:23 GMT
I also found the following exception in the datanode log; I suppose it might
give some clue:

2013-01-10 11:37:55,397 ERROR
org.apache.hadoop.hdfs.server.datanode.DataNode:
node02.303net.pvt:50010:DataXceiver error processing READ_BLOCK operation
 src: /192.168.1.112:35991 dest: /192.168.1.112:50010
java.net.SocketTimeoutException: 480000 millis timeout while waiting for
channel to be ready for write. ch :
java.nio.channels.SocketChannel[connected local=/192.168.1.112:50010
remote=/192.168.1.112:35991]
        at
org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
        at
org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
        at
org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
        at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:492)
        at
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:655)
        at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:280)
        at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:88)
        at
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:63)
        at
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:219)
        at java.lang.Thread.run(Thread.java:662)
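
The 480000 ms in that timeout matches the default DataNode socket write
timeout (dfs.datanode.socket.write.timeout, 8 minutes). If slow readers are
expected on this cluster, one possible mitigation (a sketch, not a confirmed
fix for this case) is to raise it in hdfs-site.xml on the datanodes:

```xml
<!-- hdfs-site.xml on each DataNode: raises the socket write timeout from
     the default 480000 ms (8 min); 960000 here is just an example value. -->
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>960000</value>
</property>
```

A longer timeout only masks slow clients, of course; it does not explain why
the channel stalled in the first place.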


On Thu, Jan 10, 2013 at 4:04 PM, Ivan Tretyakov
<itretyakov@griddynamics.com> wrote:

> Hello!
>
> On our cluster, jobs fail with the following exception:
>
> 2013-01-10 10:34:05,648 WARN org.apache.hadoop.hdfs.DFSClient:
> DataStreamer Exception
> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
> /user/persona/usersAggregate_20130110_15/_temporary/_attempt_201212271414_0458_m_000001_1/s/375ee510bbf44815b151df556e06b5ca
> could only be replicated to 0 nodes instead of minReplication (=1).  There
> are 6 datanode(s) running and no node(s) are excluded in this operation.
>         at
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1322)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2170)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:471)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297)
>         at
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)
>
>         at org.apache.hadoop.ipc.Client.call(Client.java:1160)
>         at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>         at $Proxy10.addBlock(Unknown Source)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>         at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>         at $Proxy10.addBlock(Unknown Source)
>         at
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:290)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1150)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1003)
>         at
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:463)
>
> I've found that this error can be caused by a lack of free disk space, but
> as far as I can see everything is fine there (see the attached dfs report
> output). I also see the exception from
> https://issues.apache.org/jira/browse/MAPREDUCE-5 in the TaskTracker log,
> but I'm not sure whether it is related.
>
> Could it be related to another issue on our cluster? -
> http://mail-archives.apache.org/mod_mbox/hadoop-user/201301.mbox/%3CCAEAKFL90ReOWEvY_vuSMqU2GwMOAh0fndA9b-uodXZ6BYvz2Kg%40mail.gmail.com%3E
>
> Thanks in advance!
>
> --
> Best Regards
> Ivan Tretyakov
>
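
For anyone else hitting "could only be replicated to 0 nodes": besides
`hdfs dfsadmin -report`, it is worth checking usable space on each
datanode's data volume directly, since a node whose free space falls below
dfs.datanode.du.reserved is skipped for block placement. A minimal sketch
(DN_DATA_DIR is a hypothetical placeholder for your dfs.datanode.data.dir):

```shell
# Quick per-datanode space check; the NameNode will not place replicas on
# a node whose usable space (minus dfs.datanode.du.reserved) is exhausted.
# DN_DATA_DIR is a hypothetical placeholder for dfs.datanode.data.dir.
DN_DATA_DIR="${DN_DATA_DIR:-/}"
avail_kb=$(df -P "$DN_DATA_DIR" | awk 'NR==2 {print $4}')
echo "usable space on ${DN_DATA_DIR}: ${avail_kb} KB"
# Cross-check with the cluster-wide view:
#   hdfs dfsadmin -report | grep -B1 -A4 'DFS Remaining'
```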



-- 
Best Regards
Ivan Tretyakov

Deployment Engineer
Grid Dynamics
+7 812 640 38 76
Skype: ivan.tretyakov
www.griddynamics.com
itretyakov@griddynamics.com
