hadoop-hdfs-issues mailing list archives

From "Lisheng Sun (Jira)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-14648) DeadNodeDetector basic model
Date Sun, 25 Aug 2019 16:02:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-14648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16915281#comment-16915281
] 

Lisheng Sun commented on HDFS-14648:
------------------------------------

Thanks [~zhangchen] for the good suggestion. I will upload a design doc that describes this
patch in detail later.

> DeadNodeDetector basic model
> ----------------------------
>
>                 Key: HDFS-14648
>                 URL: https://issues.apache.org/jira/browse/HDFS-14648
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Lisheng Sun
>            Assignee: Lisheng Sun
>            Priority: Major
>         Attachments: HDFS-14648.001.patch, HDFS-14648.002.patch, HDFS-14648.003.patch,
HDFS-14648.004.patch
>
>
> This Jira constructs the DeadNodeDetector state machine model. It implements the following
functions:
>  # When a DFSInputStream is opened, a BlockReader is opened. If a DataNode holding the
block is found to be inaccessible, the DataNode is put into DeadNodeDetector#deadnode. HDFS-14649
will optimize this part, because when a DataNode is not accessible, it is likely that the replica
has been removed from the DataNode. Therefore, this needs to be confirmed by re-probing and requires
higher-priority processing.
>  # DeadNodeDetector periodically probes the nodes in DeadNodeDetector#deadnode. If
a node can be accessed successfully, it is removed from DeadNodeDetector#deadnode. Continuous
probing of dead nodes is necessary because a DataNode may rejoin the cluster after a service
restart or machine repair. Without a probe mechanism, the DataNode could be excluded
permanently.
>  # DeadNodeDetector#dfsInputStreamNodes records the DataNodes used by each DFSInputStream. When
a DFSInputStream is closed, it is removed from DeadNodeDetector#dfsInputStreamNodes.
>  # Every time the global dead nodes are fetched, DeadNodeDetector#deadnode is updated. The new DeadNodeDetector#deadnode
equals the intersection of the old DeadNodeDetector#deadnode and the DataNodes recorded in DeadNodeDetector#dfsInputStreamNodes.
>  # DeadNodeDetector has a switch that is turned off by default. When it is off, each
DFSInputStream still uses its own local deadnode.
>  # This feature has been used in the XIAOMI production environment for a long time. It has
reduced HBase reads getting stuck due to node hangs.
>  # To use the feature, just turn on the DeadNodeDetector switch. There are no other restrictions.
If you don't want to use DeadNodeDetector, just turn it off.
> {code:java}
> if (sharedDeadNodesEnabled && deadNodeDetector == null) {
>   deadNodeDetector = new DeadNodeDetector(name);
>   deadNodeDetectorThr = new Daemon(deadNodeDetector);
>   deadNodeDetectorThr.start();
> }
> {code}
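
The bookkeeping described in the list above can be sketched roughly as follows. This is a
simplified, hypothetical model, not the code from the attached patches: the class name
DeadNodeModelSketch, its method names, and the use of plain strings for stream and DataNode
ids are all assumptions made for illustration, and the reachability check is passed in as a
predicate instead of performing a real probe.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical, simplified model of the description; not the HDFS-14648 patch.
class DeadNodeModelSketch {
  // Shared set of DataNodes currently considered dead.
  private final Set<String> deadNodes = ConcurrentHashMap.newKeySet();
  // Open stream id -> DataNodes that the stream is using.
  private final Map<String, Set<String>> streamNodes = new ConcurrentHashMap<>();

  // An open stream registers a DataNode it reads from.
  void addStreamNode(String streamId, String dataNode) {
    streamNodes.computeIfAbsent(streamId, k -> ConcurrentHashMap.newKeySet())
        .add(dataNode);
  }

  // A stream reports a DataNode that it found to be inaccessible.
  void reportDead(String streamId, String dataNode) {
    addStreamNode(streamId, dataNode);
    deadNodes.add(dataNode);
  }

  // Closing a stream drops its entry from the stream map.
  void closeStream(String streamId) {
    streamNodes.remove(streamId);
  }

  // One probe round: nodes that are reachable again leave the dead set.
  void probeOnce(Predicate<String> isReachable) {
    deadNodes.removeIf(isReachable);
  }

  // Keep only the dead nodes still used by an open stream, then return a copy.
  Set<String> getDeadNodes() {
    Set<String> inUse = new HashSet<>();
    for (Set<String> nodes : streamNodes.values()) {
      inUse.addAll(nodes);
    }
    deadNodes.retainAll(inUse);
    return new HashSet<>(deadNodes);
  }
}
```

In this sketch a periodic daemon thread would call probeOnce, and each stream would call
getDeadNodes before choosing a DataNode. A dead node drops out of the shared set either when
a probe succeeds or when no open stream references it any longer.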



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

