hadoop-mapreduce-user mailing list archives

From Chen Song <chen.song...@gmail.com>
Subject Re: how to catch exception when data cannot be replication to any datanode
Date Mon, 02 Mar 2015 19:41:10 GMT
I am using CDH5.1.0, which is hadoop 2.3.0.
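
On the original question of surfacing the error on the client: the "Failed to close inode" message appears to come from DFSClient cleaning up streams that are still open when FileSystem.close is called, where the IOException is logged rather than rethrown. A minimal sketch of the usual workaround is to close the output stream explicitly, so that any IOException from close reaches the caller. The sketch uses only java.io so that it runs on its own; FailingOnCloseStream and writeAndClose are hypothetical names, standing in for the stream returned by FileSystem.create and for your own write routine.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class ExplicitCloseSketch {

    /** Hypothetical stand-in for an HDFS output stream whose close() fails. */
    static class FailingOnCloseStream extends FilterOutputStream {
        FailingOnCloseStream(OutputStream out) {
            super(out);
        }

        @Override
        public void close() throws IOException {
            // Simulates the NameNode rejecting the final block allocation.
            throw new IOException(
                "could only be replicated to 0 nodes instead of minReplication (=1)");
        }
    }

    /**
     * Writes data and closes the stream in the caller's own try block.
     * Returns true on success, false if the write or the close failed.
     */
    static boolean writeAndClose(OutputStream out, byte[] data) {
        // try-with-resources calls close() here, so an IOException thrown by
        // close() lands in the catch below instead of only appearing in a log.
        try (OutputStream o = out) {
            o.write(data);
            return true;
        } catch (IOException e) {
            System.err.println("write failed: " + e.getMessage());
            return false; // the caller can retry, alert, or remove the empty file
        }
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes();
        System.out.println(writeAndClose(new ByteArrayOutputStream(), data));
        System.out.println(
            writeAndClose(new FailingOnCloseStream(new ByteArrayOutputStream()), data));
    }
}
```

With a real cluster the same pattern would wrap the stream obtained from FileSystem.create. Whether the error surfaces at write, flush, or close depends on when the client asks the NameNode for a new block, so treat this as an assumption to verify against your Hadoop version.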

On Mon, Mar 2, 2015 at 12:23 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Which hadoop release are you using ?
>
> In branch-2, I see this IOE in BlockManager :
>
>     if (targets.length < minReplication) {
>       throw new IOException("File " + src + " could only be replicated to "
>           + targets.length + " nodes instead of minReplication (="
>           + minReplication + ").  There are "
>
> Cheers
>
> On Mon, Mar 2, 2015 at 8:44 AM, Chen Song <chen.song.82@gmail.com> wrote:
>
>> Hey
>>
>> I got the following error in the application logs when trying to put a
>> file to DFS.
>>
>> 2015-02-27 19:42:01 DFSClient [ERROR] Failed to close inode 559475968
>> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/impbus.log_impbus_view.v001.2015022719.T07-431672015022719385410197.pb.pb could only be replicated to 0 nodes instead of minReplication (=1).  There are 317 datanode(s) running and no node(s) are excluded in this operation.
>>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1447)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2703)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:569)
>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
>>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:415)
>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)
>>
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1409)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:1362)
>>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>>         at com.sun.proxy.$Proxy23.addBlock(Unknown Source)
>>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:362)
>>         at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>>         at com.sun.proxy.$Proxy24.addBlock(Unknown Source)
>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1438)
>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1260)
>>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525)
>>
>>
>> This results in an empty file in HDFS. I searched this mailing list and
>> found that this can be caused by a full disk or an unreachable datanode.
>>
>> However, this exception was only logged at WARN level when
>> FileSystem.close was called, and was never thrown to the client. My
>> question is: at the client level, how can I catch this exception and
>> handle it?
>>
>> Chen
>>
>> --
>> Chen Song
>>
>>
>


-- 
Chen Song
