hadoop-hdfs-issues mailing list archives

From "Inigo Goiri (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9901) Move block validation out of the heartbeat thread
Date Thu, 10 Mar 2016 02:04:40 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15188501#comment-15188501
] 

Inigo Goiri commented on HDFS-9901:
-----------------------------------

The first version of the patch:
# Makes {{DF}} asynchronous when monitoring the disk: a dedicated thread checks the disk and
updates the disk status periodically, and {{FsVolumeImpl}} reads the values collected by
that thread.
# Moves the checks in {{transferBlock()}} in {{DataNode}} (which require disk accesses) into
a separate thread, so the heartbeat thread does not have to wait for them.
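
As a rough illustration of the first point, the sketch below caches the free-space value and refreshes it on a background thread, so callers read it without touching the disk. It is not taken from the patch: the class name {{CachedDiskUsage}}, the method {{getAvailable()}}, and the refresh interval parameter are all hypothetical, and the real {{DF}} class may be structured differently.

```java
import java.io.File;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for an asynchronous DF; not the class from the patch. */
class CachedDiskUsage implements AutoCloseable {
    private final File dir;
    private final ScheduledExecutorService refresher;
    // volatile so readers always see the most recently published value without locking
    private volatile long availableBytes;

    CachedDiskUsage(File dir, long refreshMillis) {
        this.dir = dir;
        // Seed the cache once; this is the only disk access on the caller's thread.
        this.availableBytes = dir.getUsableSpace();
        this.refresher = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "disk-usage-refresher");
            t.setDaemon(true); // do not keep the JVM alive for monitoring
            return t;
        });
        refresher.scheduleWithFixedDelay(
            this::refresh, refreshMillis, refreshMillis, TimeUnit.MILLISECONDS);
    }

    /** Runs on the refresher thread; may block while the disk is busy. */
    private void refresh() {
        availableBytes = dir.getUsableSpace();
    }

    /** Returns the cached value immediately, however slow the disk is. */
    long getAvailable() {
        return availableBytes;
    }

    @Override
    public void close() {
        refresher.shutdownNow();
    }
}
```

The trade-off in a design like this is staleness: a reader may see a value up to one refresh interval old, in exchange for never blocking on disk IO.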

> Move block validation out of the heartbeat thread
> -------------------------------------------------
>
>                 Key: HDFS-9901
>                 URL: https://issues.apache.org/jira/browse/HDFS-9901
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Hua Liu
>            Assignee: Hua Liu
>         Attachments: 0001-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch
>
>
> During heavy disk IO, we noticed that the heartbeat thread hangs in the checkBlock method, which checks
the existence and length of a block before spinning off a thread to do the actual transfer.
In extreme cases, the heartbeat thread hung for more than 10 minutes, so the namenode marked the
datanode as dead and started replicating its blocks, which caused more disk IO on other nodes
and could potentially bring them down as well.
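
One way to avoid the stall described above is to run the disk-touching check inside the worker task instead of before it is spawned. The following is a minimal sketch under that assumption; {{AsyncBlockTransfer}} and its methods are hypothetical names for illustration, not the actual {{DataNode}} code or the attached patch.

```java
import java.io.File;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch; class and method names are illustrative, not DataNode APIs. */
class AsyncBlockTransfer implements AutoCloseable {
    private final ExecutorService workers = Executors.newFixedThreadPool(2);

    /** Touches the disk: verifies the block file exists and has the expected length. */
    static boolean checkBlock(File blockFile, long expectedLength) {
        return blockFile.isFile() && blockFile.length() == expectedLength;
    }

    /**
     * Intended to be called from the heartbeat thread. It only enqueues work and
     * returns immediately; validation and the (elided) transfer run on a worker.
     */
    Future<Boolean> transferBlock(File blockFile, long expectedLength) {
        return workers.submit(() -> {
            if (!checkBlock(blockFile, expectedLength)) {
                return false; // a real implementation would handle the invalid block here
            }
            // ... the actual block transfer would go here ...
            return true;
        });
    }

    @Override
    public void close() {
        workers.shutdown();
    }
}
```

With this arrangement a slow disk delays only the worker thread; the caller can keep sending heartbeats and inspect the returned {{Future}} later if it needs the result.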



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
