hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3703) Decrease the datanode failure detection time
Date Thu, 13 Sep 2012 16:56:09 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13455006#comment-13455006 ]

Suresh Srinivas commented on HDFS-3703:
---------------------------------------

bq. I think disabling the data node heart beat simulates GC pause, etc. Shutting down data
node and restarting would change its internal state. 

The changes here mainly test the state of the datanode as seen by the namenode. The
real state of the datanode is immaterial.

In the trunk version of the patch, Jing chose to disable the heartbeat after marking the
datanode stale. This ensures that the datanode marked stale would not send a heartbeat,
become live again, and cause a test failure. Shutting down the datanode will essentially
do the same: a datanode that has been shut down will not send a heartbeat. Also, the test
finishes within 10 minutes, by which time the datanode will be marked dead. This
accomplishes what the test does in trunk with the least amount of change.

That said, if you think it is straightforward to add the capability to disable heartbeats at
the datanode, do go ahead.
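
For reference on the 10-minute figure discussed above: the namenode marks a datanode dead once roughly 2 × the recheck interval plus 10 × the heartbeat interval pass without a heartbeat. A minimal sketch of that arithmetic, assuming the stock defaults for dfs.namenode.heartbeat.recheck-interval (300 s) and dfs.heartbeat.interval (3 s):

```python
# Dead-node expiry as computed by the namenode (sketch, assuming defaults):
#   expiry = 2 * recheck_interval + 10 * heartbeat_interval
heartbeat_recheck_interval_s = 5 * 60   # dfs.namenode.heartbeat.recheck-interval
heartbeat_interval_s = 3                # dfs.heartbeat.interval

expiry_s = 2 * heartbeat_recheck_interval_s + 10 * heartbeat_interval_s
print(expiry_s)        # 630 seconds
print(expiry_s / 60)   # 10.5 minutes -- the "10:30" in the issue description
```

This is why a test that stops heartbeats and then waits 10 minutes can rely on the datanode being declared dead.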
                
> Decrease the datanode failure detection time
> --------------------------------------------
>
>                 Key: HDFS-3703
>                 URL: https://issues.apache.org/jira/browse/HDFS-3703
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node, name-node
>    Affects Versions: 1.0.3, 2.0.0-alpha, 3.0.0
>            Reporter: nkeywal
>            Assignee: Jing Zhao
>             Fix For: 3.0.0, 2.0.3-alpha
>
>         Attachments: 3703-hadoop-1.0.txt, HDFS-3703-branch2.patch, HDFS-3703.patch, HDFS-3703-trunk-read-only.patch,
HDFS-3703-trunk-read-only.patch, HDFS-3703-trunk-read-only.patch, HDFS-3703-trunk-read-only.patch,
HDFS-3703-trunk-read-only.patch, HDFS-3703-trunk-read-only.patch, HDFS-3703-trunk-read-only.patch,
HDFS-3703-trunk-with-write.patch
>
>
> By default, if a box dies, the datanode will be marked as dead by the namenode after
10:30 minutes. In the meantime, this datanode will still be proposed by the namenode to write
blocks or to read replicas. The same happens if the datanode process crashes: there are no
shutdown hooks to tell the namenode we're not there anymore.
> It is especially an issue with HBase. The HBase regionserver timeout in production is often
30s. So with these configs, when a box dies HBase starts to recover after 30s while, for 10
minutes, the namenode still considers the blocks on that box as available. Beyond the write
errors, this triggers a lot of missed reads:
> - during the recovery, HBase needs to read the blocks used on the dead box (the ones
in the 'HBase Write-Ahead-Log')
> - after the recovery, reading these data blocks (the 'HBase region') will fail 33% of
the time with the default number of replicas, slowing down data access, especially when the
errors are socket timeouts (i.e. around 60s most of the time).
> Globally, it would be ideal if the HDFS detection settings could be set below the HBase ones.
> As a side note, HBase relies on ZooKeeper to detect regionservers issues.
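
As a sketch of how the stale-node behavior from this issue might be enabled, an hdfs-site.xml fragment; the property names match those introduced by the attached patches, but the values shown are illustrative assumptions, not recommendations:

```xml
<!-- Sketch: values are illustrative, tune to your deployment. -->
<property>
  <name>dfs.namenode.avoid.read.stale.datanode</name>
  <value>true</value>
  <!-- Deprioritize stale datanodes when ordering block locations for reads. -->
</property>
<property>
  <name>dfs.namenode.stale.datanode.interval</name>
  <value>30000</value>
  <!-- Milliseconds without a heartbeat before the namenode marks a datanode stale;
       30s here matches the HBase regionserver timeout cited in the description. -->
</property>
```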

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
