hadoop-mapreduce-user mailing list archives

From Chen Song <chen.song...@gmail.com>
Subject how to catch exception when data cannot be replicated to any datanode
Date Mon, 02 Mar 2015 16:44:46 GMT
Hey

I got the following error in the application logs when trying to put a file
to DFS.

2015-02-27 19:42:01 DFSClient [ERROR] Failed to close inode 559475968
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
/tmp/impbus.log_impbus_view.v001.2015022719.T07-431672015022719385410197.pb.pb
could only be replicated to 0 nodes instead of minReplication (=1).
There are 317 datanode(s) running and no node(s) are excluded in this
operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1447)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2703)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:569)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1982)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1980)

        at org.apache.hadoop.ipc.Client.call(Client.java:1409)
        at org.apache.hadoop.ipc.Client.call(Client.java:1362)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        at com.sun.proxy.$Proxy23.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:362)
        at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy24.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1438)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1260)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525)


This results in an empty file in HDFS. I did some searching through this mailing
list and found that this can be caused by a full disk or by the datanodes being
unreachable.

However, this exception is only logged at WARN level when FileSystem.close
is called, and is never thrown where the client can see it. My question is: at
the client level, how can I catch this exception and handle it?
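
From reading DFSOutputStream, my guess is that the pending DataStreamer
exception is only rethrown if I close the output stream explicitly, instead of
letting FileSystem.close clean up leaked streams. Something like this untested
sketch, where the destination path and payload are just placeholders:

        import java.io.IOException;

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FSDataOutputStream;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.io.IOUtils;

        public class PutWithExplicitClose {

            public static void put(FileSystem fs, Path dst, byte[] payload)
                    throws IOException {
                FSDataOutputStream out = fs.create(dst);
                try {
                    out.write(payload);
                    // Closing the stream ourselves should rethrow any error
                    // queued by the DataStreamer thread, e.g. the
                    // "could only be replicated to 0 nodes" RemoteException
                    // above, so we can catch and handle it at the call site.
                    out.close();
                    out = null;
                } finally {
                    // On the error path, close quietly; the exception from
                    // write()/close() is already propagating.
                    IOUtils.closeStream(out);
                }
            }

            public static void main(String[] args) throws IOException {
                FileSystem fs = FileSystem.get(new Configuration());
                // Placeholder path, standing in for the real output file.
                put(fs, new Path("/tmp/placeholder-output.pb"),
                        "test".getBytes());
            }
        }

Is that the right approach, or is there a retry hook I am missing?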

Chen

-- 
Chen Song
