hadoop-user mailing list archives

From Robert Molina <rmol...@hortonworks.com>
Subject Re: could only be replicated to 0 nodes instead of minReplication
Date Thu, 10 Jan 2013 17:17:30 GMT
Hi Ivan,
Here are a couple more suggestions from the wiki:

http://wiki.apache.org/hadoop/CouldOnlyBeReplicatedTo
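One of the checks the wiki suggests is making sure every datanode still has
usable space, since the namenode will not place replicas on a node with zero
remaining capacity even though it shows as "running". As a sketch (the node
names and numbers below are made-up sample data, not from your cluster), you
can scan a saved `hdfs dfsadmin -report` dump for such nodes:

```shell
# Summarize per-datanode remaining space from a saved dfsadmin report.
# Normally: hdfs dfsadmin -report > report.txt
# Here we use a small inline sample so the snippet is self-contained.
cat > report.txt <<'EOF'
Name: 192.168.1.112:50010
DFS Remaining: 1073741824 (1 GB)
Name: 192.168.1.113:50010
DFS Remaining: 0 (0 B)
EOF
# Print nodes whose remaining space is zero -- these are effectively
# excluded from block placement even though they count as live.
awk '/^Name:/ {node=$2} /^DFS Remaining:/ {if ($3+0 == 0) print node}' report.txt
```

Non-HDFS space on the same disks (logs, MapReduce spill files) can eat the
headroom between report runs, so it is worth checking the OS-level `df` on
each datanode as well.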

Regards,
Robert

On Thu, Jan 10, 2013 at 5:33 AM, Ivan Tretyakov <itretyakov@griddynamics.com
> wrote:

> I also found the following exception in the datanode log; I suppose it might
> give some clue:
>
> 2013-01-10 11:37:55,397 ERROR
> org.apache.hadoop.hdfs.server.datanode.DataNode:
> node02.303net.pvt:50010:DataXceiver error processing READ_BLOCK operation
>  src: /192.168.1.112:35991 dest: /192.168.1.112:50010
> java.net.SocketTimeoutException: 480000 millis timeout while waiting for
> channel to be ready for write. ch :
> java.nio.channels.SocketChannel[connected local=/192.168.1.112:50010
> remote=/192.168.1.112:35991]
>         at
> org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
>         at
> org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:166)
>         at
> org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:214)
>         at
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:492)
>         at
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:655)
>         at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:280)
>         at
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:88)
>         at
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:63)
>         at
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:219)
>         at java.lang.Thread.run(Thread.java:662)
>
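The 480000 ms in that SocketTimeoutException is the default value of
`dfs.datanode.socket.write.timeout` (8 minutes): the datanode gave up waiting
for a slow reader. If slow clients turn out to be the cause, you could raise
it in hdfs-site.xml on the datanodes; this is just a sketch, and the value
below is an arbitrary example:

```xml
<!-- hdfs-site.xml: datanode write-path socket timeout in milliseconds
     (default 480000 = 8 min); example value only, tune for your cluster -->
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>960000</value>
</property>
```

Note that raising the timeout only masks slow readers, so it is worth ruling
out network and disk problems on that node first.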
>
> On Thu, Jan 10, 2013 at 4:04 PM, Ivan Tretyakov <
> itretyakov@griddynamics.com> wrote:
>
>> Hello!
>>
>> On our cluster, jobs fail with the following exception:
>>
>> 2013-01-10 10:34:05,648 WARN org.apache.hadoop.hdfs.DFSClient:
>> DataStreamer Exception
>> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
>> /user/persona/usersAggregate_20130110_15/_temporary/_attempt_201212271414_0458_m_000001_1/s/375ee510bbf44815b151df556e06b5ca
>> could only be replicated to 0 nodes instead of minReplication (=1).  There
>> are 6 datanode(s) running and no node(s) are excluded in this operation.
>>         at
>> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1322)
>>         at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2170)
>>         at
>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:471)
>>         at
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297)
>>         at
>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080)
>>         at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>         at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)
>>
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1160)
>>         at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
>>         at $Proxy10.addBlock(Unknown Source)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>         at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>         at
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>>         at
>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>>         at $Proxy10.addBlock(Unknown Source)
>>         at
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:290)
>>         at
>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1150)
>>         at
>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1003)
>>         at
>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:463)
>>
>> I've found that it could be caused by a lack of free disk space, but as far
>> as I can see everything is fine there (see the attached dfs report output).
>> Also, I can see the following exception in the TaskTracker log:
>> https://issues.apache.org/jira/browse/MAPREDUCE-5 but I'm not sure if it
>> is related.
>>
>> Could it be related to another issue on our cluster? -
>> http://mail-archives.apache.org/mod_mbox/hadoop-user/201301.mbox/%3CCAEAKFL90ReOWEvY_vuSMqU2GwMOAh0fndA9b-uodXZ6BYvz2Kg%40mail.gmail.com%3E
>>
>> Thanks in advance!
>>
>> --
>> Best Regards
>> Ivan Tretyakov
>>
>
>
>
> --
> Best Regards
> Ivan Tretyakov
>
> Deployment Engineer
> Grid Dynamics
> +7 812 640 38 76
> Skype: ivan.tretyakov
> www.griddynamics.com
> itretyakov@griddynamics.com
>
