hadoop-hdfs-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9901) Move disk IO out of the heartbeat thread
Date Fri, 16 Dec 2016 23:18:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15755768#comment-15755768 ]

Hadoop QA commented on HDFS-9901:

| (x) *-1 overall* |
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 4s | HDFS-9901 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-9901 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12795434/0005-HDFS-9901-Move-diskIO-out-of-heartbeat-thread.patch |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/17879/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |

This message was automatically generated.

> Move disk IO out of the heartbeat thread
> ----------------------------------------
>                 Key: HDFS-9901
>                 URL: https://issues.apache.org/jira/browse/HDFS-9901
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Hua Liu
>            Assignee: Hua Liu
>         Attachments: 0001-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch,
0002-HDFS-9901-Move-block-validation-out-of-the-heartbeat.patch, 0003-HDFS-9901-Move-disk-IO-out-of-the-heartbeat-thread.patch,
0004-HDFS-9901-move-diskIO-out-of-the-heartbeat-thread.patch, 0005-HDFS-9901-Move-diskIO-out-of-heartbeat-thread.patch
> During heavy disk IO, we noticed that the heartbeat thread hangs in the checkBlock method,
which checks the existence and length of a block before spinning off a thread to do the actual
transfer. In extreme cases, the heartbeat thread hung for more than 10 minutes, so the namenode
marked the datanode as dead and started replicating its blocks, which caused more disk IO on
other nodes and could potentially bring them down as well.
> The patch contains two changes:
> 1. Makes DF asynchronous when monitoring the disk by creating a thread that checks the
disk and updates the disk status periodically. When the heartbeat thread generates a storage
report, it reads disk usage information from memory, so the heartbeat thread is not
blocked during heavy disk IO.
> 2. Moves the checks (which require disk access) in transferBlock() in DataNode into
a separate thread so the heartbeat does not have to wait for them when heartbeating.
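
The two changes described above can be illustrated with a minimal Java sketch. It is not the code from the attached patch: the class names (HeartbeatOffloadSketch, CachedDiskUsage, BlockTransferScheduler), the probe and validation callbacks, the refresh period, and the pool size are all hypothetical and chosen for illustration. It assumes the slow disk probe can be expressed as a supplier that runs on a background thread, so that the caller standing in for the heartbeat thread only reads a value from memory or enqueues a task.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

/** Hypothetical sketch; these are not the classes touched by HDFS-9901. */
final class HeartbeatOffloadSketch {

    /** Change 1: a background thread refreshes disk usage; readers only touch memory. */
    static final class CachedDiskUsage implements AutoCloseable {
        // -1 means "not measured yet".
        private final AtomicLong usedBytes = new AtomicLong(-1L);
        private final ScheduledExecutorService refresher =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "disk-usage-refresher");
                t.setDaemon(true);
                return t;
            });

        CachedDiskUsage(LongSupplier slowProbe, long periodMillis) {
            // The probe may block on disk, so it runs only on the refresher thread.
            refresher.scheduleWithFixedDelay(() -> {
                try {
                    usedBytes.set(slowProbe.getAsLong());
                } catch (RuntimeException e) {
                    // Keep the last known value; an uncaught exception would cancel the schedule.
                }
            }, 0, periodMillis, TimeUnit.MILLISECONDS);
        }

        /** Called from the heartbeat path: a plain memory read, no disk access. */
        long getUsedBytes() {
            return usedBytes.get();
        }

        @Override
        public void close() {
            refresher.shutdownNow();
        }
    }

    /** Change 2: block validation (disk access) runs on a worker, not on the caller. */
    static final class BlockTransferScheduler implements AutoCloseable {
        private final ExecutorService pool = Executors.newFixedThreadPool(2);

        /** Returns immediately; validation and transfer both happen on a pool thread. */
        void transferBlock(BooleanSupplier blockIsValid, Runnable transfer) {
            pool.execute(() -> {
                if (blockIsValid.getAsBoolean()) {
                    transfer.run();
                }
            });
        }

        @Override
        public void close() {
            // Lets already queued work finish, then stops the workers.
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        try (CachedDiskUsage usage = new CachedDiskUsage(() -> 1024L, 50);
             BlockTransferScheduler scheduler = new BlockTransferScheduler()) {
            scheduler.transferBlock(() -> true, () -> System.out.println("transfer started"));
            Thread.sleep(200); // give the background threads time to run
            System.out.println("cached used bytes: " + usage.getUsedBytes());
        }
    }
}
```

The trade-off in this sketch is staleness: the heartbeat reports a disk usage value that may be up to one refresh period old, in exchange for never blocking on the disk itself.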

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org
