hadoop-hdfs-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-9239) DataNode Lifeline Protocol: an alternative protocol for reporting DataNode liveness
Date Fri, 04 Mar 2016 23:51:41 GMT

     [ https://issues.apache.org/jira/browse/HDFS-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated HDFS-9239:
    Release Note: This release adds a new feature called the DataNode Lifeline Protocol. 
If configured, then DataNodes can report that they are still alive to the NameNode via a fallback
protocol, separate from the existing heartbeat messages.  This can prevent the NameNode from
incorrectly marking DataNodes as stale or dead in highly overloaded clusters where heartbeat
processing is suffering delays.  For more information, please refer to the hdfs-default.xml
documentation for several new configuration properties: dfs.namenode.lifeline.rpc-address,
dfs.namenode.lifeline.rpc-bind-host, dfs.datanode.lifeline.interval.seconds, dfs.namenode.lifeline.handler.ratio
and dfs.namenode.lifeline.handler.count.
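
As a rough illustration of how some of these properties might appear in hdfs-site.xml, the fragment below is a minimal sketch, not a recommended configuration. The host name nn1.example.com, the port 8050, and all numeric values are assumptions chosen for the example; the authoritative descriptions and defaults are in hdfs-default.xml.

```xml
<!-- hdfs-site.xml fragment: illustrative sketch only, all values are assumptions -->
<configuration>
  <!-- Address of the NameNode's lifeline RPC server.
       Host name and port are hypothetical. -->
  <property>
    <name>dfs.namenode.lifeline.rpc-address</name>
    <value>nn1.example.com:8050</value>
  </property>

  <!-- Seconds between lifeline messages sent by a DataNode.
       9 is an example value, not a documented recommendation. -->
  <property>
    <name>dfs.datanode.lifeline.interval.seconds</name>
    <value>9</value>
  </property>

  <!-- Sizing of the lifeline handler threads relative to the regular
       NameNode handlers. 0.10 is an example value. -->
  <property>
    <name>dfs.namenode.lifeline.handler.ratio</name>
    <value>0.10</value>
  </property>
</configuration>
```

dfs.namenode.lifeline.rpc-bind-host and dfs.namenode.lifeline.handler.count are left out of this sketch; whether they need to be set depends on the deployment.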

> DataNode Lifeline Protocol: an alternative protocol for reporting DataNode liveness
> -----------------------------------------------------------------------------------
>                 Key: HDFS-9239
>                 URL: https://issues.apache.org/jira/browse/HDFS-9239
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>             Fix For: 2.8.0
>         Attachments: DataNode-Lifeline-Protocol.pdf, HDFS-9239.001.patch, HDFS-9239.002.patch,
> This issue proposes introduction of a new feature: the DataNode Lifeline Protocol.  This
> is an RPC protocol that is responsible for reporting liveness and basic health information
> about a DataNode to a NameNode.  Compared to the existing heartbeat messages, it is lightweight
> and not prone to the resource contention problems that can currently impair accurate tracking
> of DataNode liveness.  The attached design document contains more details.

This message was sent by Atlassian JIRA
