hadoop-hdfs-user mailing list archives

From Frank Lanitz <frank.lan...@sql-ag.de>
Subject Re: Time until a datanode is marked as dead
Date Mon, 26 Jan 2015 09:39:08 GMT
Hi,

On 23.01.2015 at 19:23, Chris Nauroth wrote:
> The time period for determining if a datanode is dead is calculated as a
> function of a few different configuration properties.  The current
> implementation in DatanodeManager.java does it like this:
> 
>     final long heartbeatIntervalSeconds = conf.getLong(
>         DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY,
>         DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_DEFAULT);
>     final int heartbeatRecheckInterval = conf.getInt(
>         DFSConfigKeys.DFS_NAMENODE_HEARTBEAT_RECHECK_INTERVAL_KEY,
>         DFSConfigKeys.DFS_NAMENODE_HEARTBEAT_RECHECK_INTERVAL_DEFAULT); // 5 minutes
>     this.heartbeatExpireInterval = 2 * heartbeatRecheckInterval
>         + 10 * 1000 * heartbeatIntervalSeconds;


Good to know.

> Under default configuration, dfs.namenode.heartbeat.recheck-interval is
> 5 minutes and dfs.heartbeat.interval is 3 seconds.  If we plug those
> values into the formula, we get 10.5 minutes, which agrees with your
> observation.  If you change dfs.namenode.heartbeat.recheck-interval to
> 2.5 minutes, then you'll achieve an effective timeout of 5.5 minutes
> before a datanode is marked dead.
> 
> dfs.namenode.heartbeat.recheck-interval is not documented in
> hdfs-default.xml, though I don't recall if that's an intentional choice
> or just an oversight.  The value of the property must be expressed in
> milliseconds.

This did the trick. Thank you very much. For testing purposes I've set
it to 10000 and after approx. 45 s the node was marked as dead.
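
For reference, the arithmetic from the formula quoted above can be
reproduced with a small standalone sketch. The class and method names
(HeartbeatExpiry, expireIntervalMillis) are made up for illustration and
are not part of Hadoop; only the formula itself is taken from the quoted
DatanodeManager.java snippet, and the heartbeat interval is assumed to
stay at the default of 3 seconds.

```java
public class HeartbeatExpiry {

    // Hypothetical helper mirroring the quoted formula:
    // 2 * recheck interval (milliseconds)
    //   + 10 * heartbeat interval (seconds, converted to milliseconds).
    static long expireIntervalMillis(long recheckIntervalMillis,
                                     long heartbeatIntervalSeconds) {
        return 2 * recheckIntervalMillis
            + 10 * 1000 * heartbeatIntervalSeconds;
    }

    public static void main(String[] args) {
        // Assumed default for dfs.heartbeat.interval (seconds).
        long heartbeatSeconds = 3;

        // dfs.namenode.heartbeat.recheck-interval values in milliseconds:
        // the 5 minute default, 2.5 minutes, and the 10000 used for testing.
        long[] recheckMillis = {300_000L, 150_000L, 10_000L};

        for (long recheck : recheckMillis) {
            long expireSeconds =
                expireIntervalMillis(recheck, heartbeatSeconds) / 1000;
            System.out.println("recheck-interval=" + recheck
                + " ms -> marked dead after about " + expireSeconds + " s");
        }
    }
}
```

With these inputs the formula gives 630 s (10.5 minutes), 330 s (5.5
minutes) and 50 s, so the 10000 ms setting lands in the same range as
the roughly 45 s seen in the test.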

Is there any chance of getting this property documented, so that
possible behavior changes in future releases can be spotted before they
reach the staging area?

cheers,
Frank
