hadoop-hdfs-issues mailing list archives

From 邓飞 (JIRA) <j...@apache.org>
Subject [jira] [Commented] (HDFS-9513) DataNodeManager#getDataNodeStorageInfos not backward compatibility
Date Mon, 07 Dec 2015 10:28:11 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15044717#comment-15044717 ]

邓飞 commented on HDFS-9513:
--------------------------

  Usually the storageInfo can be looked up from the block, but in this case that does not seem to work.
  DatanodeManager#getDatanodeStorageInfos is called by:
    1. FSNamesystem#commitBlockSynchronization
    2. FSNamesystem#getAdditionalDatanode
    3. FSNamesystem#updatePipelineInternal
  The first method is called from DataNodes, so it is compatible.
  When a new block is added, the block and its storageInfo are stored in the INodeFile / blocksMap / edit log. But when an additional DataNode is added to the pipeline, the additional DataNode and its storageInfo are not stored, because the client needs to transfer the RBW block to it first.
  So when FSNamesystem#updatePipelineInternal is called, the old block's storageInfo is not sufficient, and the missing storageInfo cannot be recovered from it.
  And if the client is older than 2.3.0, the storageInfo is not useful anyway: although the NN chooses the LocatedBlock with storageInfo, the client does not pass the storageIDs back when writing to the DNs.
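
  To make the failure mode concrete, below is a minimal, self-contained sketch (illustrative only; the class and field names are made up, and this is not the actual DatanodeManager code) of why indexing the client-supplied storage IDs in lockstep with the datanode list throws ArrayIndexOutOfBoundsException: 0 when a pre-2.3.0 client sends no storage IDs:

{noformat}
import java.util.Arrays;

/** Illustrative only: mimics the lockstep storage lookup that fails for old clients. */
public class StorageLookupSketch {

  /** Stand-in for DatanodeStorageInfo; it only carries a storage ID here. */
  static class StorageInfo {
    final String storageId;
    StorageInfo(String storageId) { this.storageId = storageId; }
    @Override public String toString() { return "StorageInfo(" + storageId + ")"; }
  }

  /**
   * Roughly the shape of the NameNode-side lookup: one storage ID is expected
   * per datanode, so storageIDs[i] is read for every i < datanodeIds.length.
   */
  static StorageInfo[] getStorageInfos(String[] datanodeIds, String[] storageIDs) {
    StorageInfo[] storages = new StorageInfo[datanodeIds.length];
    for (int i = 0; i < datanodeIds.length; i++) {
      // A 2.2.0 client sends no storage IDs, so storageIDs is empty and this
      // access throws ArrayIndexOutOfBoundsException: 0 on the NameNode.
      storages[i] = new StorageInfo(storageIDs[i]);
    }
    return storages;
  }

  public static void main(String[] args) {
    String[] newNodes = {"dn-1", "dn-2"};   // the pipeline nodes are always sent
    String[] storageIdsFromOldClient = {};  // a pre-HDFS-2832 client sends none
    System.out.println(Arrays.toString(getStorageInfos(newNodes, storageIdsFromOldClient)));
  }
}
{noformat}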

  

> DataNodeManager#getDataNodeStorageInfos not backward compatibility
> ------------------------------------------------------------------
>
>                 Key: HDFS-9513
>                 URL: https://issues.apache.org/jira/browse/HDFS-9513
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client, namenode
>    Affects Versions: 2.2.0, 2.7.1
>         Environment: 2.2.0 HDFS Client & 2.7.1 HDFS Cluster
>            Reporter: 邓飞
>            Assignee: 邓飞
>            Priority: Blocker
>
> We upgraded our HDFS cluster to 2.7.1, but our YARN cluster is still 2.2.0 (8000+ nodes; it is
too hard to upgrade it as quickly as the HDFS cluster).
> The compatibility problem happens when the DataStreamer does pipeline recovery: the NN needs the
DNs' storageInfo to update the pipeline, and the storageIDs are paired one-to-one with the
pipeline's DNs. HDFS has supported the storage type feature since 2.3.0
([HDFS-2832|https://issues.apache.org/jira/browse/HDFS-2832]), so older clients do not send
storageIDs. The protobuf serialization keeps the protocol wire-compatible, but the client then
gets a remote ArrayIndexOutOfBoundsException; a sketch after the stack trace below illustrates a
tolerant handling.
> ----
> the exception stack is below:
> {noformat}
> 2015-12-05 20:26:38,291 ERROR [Thread-4] org.apache.hadoop.hdfs.DFSClient: Failed to
close file XXX
> org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException): 0
> 	at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.getDatanodeStorageInfos(DatanodeManager.java:513)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipelineInternal(FSNamesystem.java:6439)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updatePipeline(FSNamesystem.java:6404)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updatePipeline(NameNodeRpcServer.java:892)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updatePipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:997)
> 	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1066)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1347)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1300)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> 	at com.sun.proxy.$Proxy10.updatePipeline(Unknown Source)
> 	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.updatePipeline(ClientNamenodeProtocolTranslatorPB.java:801)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> 	at com.sun.proxy.$Proxy11.updatePipeline(Unknown Source)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1047)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:823)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:475)
> {noformat}
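
The description above notes that protobuf keeps the RPC wire-compatible: a repeated field that an
old client never sets simply decodes as an empty list, so the call only fails once the NameNode
indexes that list. As a rough illustration of a tolerant NameNode-side pairing (the names are made
up for this sketch, and it is not the actual HDFS-9513 patch):

{noformat}
import java.util.Arrays;

/** Illustrative only: pairs storage IDs with datanodes without assuming the arrays match. */
public class TolerantPairingSketch {

  /**
   * If an old client sent fewer storage IDs than datanodes (typically zero),
   * fall back to null for the missing entries instead of throwing.
   */
  static String[] pairStorageIds(String[] datanodeIds, String[] storageIDs) {
    String[] paired = new String[datanodeIds.length];
    for (int i = 0; i < datanodeIds.length; i++) {
      paired[i] = (storageIDs != null && i < storageIDs.length) ? storageIDs[i] : null;
    }
    return paired;
  }

  public static void main(String[] args) {
    String[] newNodes = {"dn-1", "dn-2"};
    // An old client effectively sends an empty storage-ID list; no exception here.
    System.out.println(Arrays.toString(pairStorageIds(newNodes, new String[0])));
  }
}
{noformat}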



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
