hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jesus Camacho Rodriguez (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-13513) cleardanglingscratchdir does not work in some version of HDFS
Date Thu, 26 May 2016 07:37:12 GMT

    [ https://issues.apache.org/jira/browse/HIVE-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15301686#comment-15301686
] 

Jesus Camacho Rodriguez commented on HIVE-13513:
------------------------------------------------

[~daijy], could you push to branch-2.1 too? Master is version 2.2.0 now (I updated the fix
version accordingly). Thanks

> cleardanglingscratchdir does not work in some version of HDFS
> -------------------------------------------------------------
>
>                 Key: HIVE-13513
>                 URL: https://issues.apache.org/jira/browse/HIVE-13513
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Daniel Dai
>            Assignee: Daniel Dai
>             Fix For: 1.3.0, 2.2.0
>
>         Attachments: HIVE-13513.1.patch, HIVE-13513.2.patch
>
>
> On some Hadoop version, we keep getting "lease recovery" message at the time we check
for scratchdir by opening for appending:
> {code}
> Failed to APPEND_FILE xxx for DFSClient_NONMAPREDUCE_785768631_1 on 10.0.0.18 because
lease recovery is in progress. Try again later.
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2917)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2677)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:2984)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2953)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:655)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:421)
> 	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2137)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2133)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2131)
> {code}
> and
> {code}
> 16/04/14 04:51:56 ERROR hdfs.DFSClient: Failed to close inode 18963
> java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to
no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[10.0.0.12:30010,DS-b355ac2a-a23a-418a-af9b-4c1b4e26afe8,DISK]],
original=[DatanodeInfoWithStorage[10.0.0.12:30010,DS-b355ac2a-a23a-418a-af9b-4c1b4e26afe8,DISK]]).
The current failed datanode replacement policy is DEFAULT, and a client may configure this
via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:951)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1017)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1165)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470)
> {code}
> The reason is not clear. However, if we remove hsync from SessionState, everything works
as expected. Attach patch to remove hsync call for now.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message