hadoop-hdfs-issues mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9945) Datanode command for evicting writers
Date Fri, 11 Mar 2016 21:44:48 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15191533#comment-15191533 ]

Kihwal Lee commented on HDFS-9945:

This patch adds a new dfsadmin command that talks to the specified datanode to stop all writers.
Since it is a new method and feature, the risk of breaking existing code is low.

The only notable change to the existing code is making {{replaceBlock()}} update the {{blockReceiver}}
class variable instead of its own local variable. We want {{evictWriters()}} to also stop
replication and balancer-related write activities. Because of this change, {{sendOOB()}}
was updated to not send if {{isDatanode}} is true. Prior to this change, {{sendOOB()}} was
not called at all while {{replaceBlock()}} was in progress, because {{blockReceiver}}, the class
variable, was still null.
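
To illustrate the change described above, here is a minimal, self-contained Java sketch. It is not the patch itself and does not use the real datanode classes: {{Receiver}}, {{Xceiver}}, {{stopWriter()}}, the {{oobSent}} counter and the constructor flag that sets {{isDatanode}} are hypothetical stand-ins. Only the names {{blockReceiver}}, {{replaceBlock()}}, {{sendOOB()}}, {{isDatanode}} and {{evictWriters()}} come from the comment. The point is that once {{replaceBlock()}} stores its receiver in the class variable, eviction can reach it, and {{sendOOB()}} then needs the {{isDatanode}} check.

```java
import java.util.Arrays;
import java.util.List;

class EvictWritersSketch {

    /** Hypothetical stand-in for the object that receives one block. */
    static class Receiver {
        private volatile boolean stopped = false;
        void stop() { stopped = true; }
        boolean isStopped() { return stopped; }
    }

    /** Hypothetical stand-in for one data-transfer connection on a datanode. */
    static class Xceiver {
        final boolean isDatanode;  // assumption: true when the peer is another datanode
        Receiver blockReceiver;    // class variable; null while no write is in progress
        int oobSent = 0;           // counts simulated out-of-band notifications

        Xceiver(boolean isDatanode) { this.isDatanode = isDatanode; }

        /** A client write: the receiver is kept in the class variable. */
        void writeBlock() { blockReceiver = new Receiver(); }

        /** A block move: the receiver now also goes into the class variable. */
        void replaceBlock() { blockReceiver = new Receiver(); }

        /** Notify the writer, unless there is none or the peer is a datanode. */
        void sendOOB() {
            if (blockReceiver == null) return;  // no write in progress
            if (isDatanode) return;             // the check added with this change
            oobSent++;
        }

        /** Stop the in-progress write, if any. Returns true if one was stopped. */
        boolean stopWriter() {
            if (blockReceiver == null) return false;
            blockReceiver.stop();
            return true;
        }
    }

    /** Stop every in-progress write, including block moves. */
    static int evictWriters(List<Xceiver> xceivers) {
        int stopped = 0;
        for (Xceiver x : xceivers) {
            if (x.stopWriter()) stopped++;
        }
        return stopped;
    }

    public static void main(String[] args) {
        Xceiver client = new Xceiver(false);
        client.writeBlock();
        Xceiver blockMove = new Xceiver(true);
        blockMove.replaceBlock();
        Xceiver idle = new Xceiver(false);

        List<Xceiver> all = Arrays.asList(client, blockMove, idle);
        for (Xceiver x : all) x.sendOOB();

        System.out.println("OOB sent to client write: " + client.oobSent);
        System.out.println("OOB sent to block move: " + blockMove.oobSent);
        System.out.println("writers stopped: " + evictWriters(all));
    }
}
```

If {{replaceBlock()}} kept its receiver in a local variable, as before the change, {{stopWriter()}} in this sketch would see a null {{blockReceiver}} for the block move and leave it running.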

> Datanode command for evicting writers
> -------------------------------------
>                 Key: HDFS-9945
>                 URL: https://issues.apache.org/jira/browse/HDFS-9945
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>         Attachments: HDFS-9945.patch
> It will be useful if there is a command to evict writers from a datanode. When a set
> of datanodes is being decommissioned, they can get blocked by slow writers at the end. It
> was rare in the old days since mapred jobs didn't last too long, but with many different types
> of apps running on today's YARN clusters, we often see a very long tail in datanode decommissioning.
> I propose a new dfsadmin command, {{evictWriters}}, to be added. I initially thought
> about having the namenode automatically tell datanodes on decommissioning, but realized that
> having a command is more flexible. E.g. users can choose not to do this at all, choose when
> to evict writers, or whether to try multiple times for whatever reason.
