hadoop-hdfs-issues mailing list archives

From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10359) Allow trigger block report from all datanodes
Date Wed, 04 May 2016 02:41:13 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270042#comment-15270042 ]

Arpit Agarwal commented on HDFS-10359:

Hi [~Tao Jie], processing full block reports is an expensive operation for the NameNode, and
it gets more expensive as the cluster size and data grow. You will effectively mount a
denial-of-service attack on your own NameNode if you trigger full block reports every time
you issue setrep. The default block report interval is 6 hours for a good reason.
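
To put rough numbers on that cost, here is a minimal sketch of the arithmetic. The cluster size and the number of setrep calls are made-up assumptions for illustration, and the function names are hypothetical. The only figure taken from the discussion is the 6-hour default interval (21600000 ms, the default value of dfs.blockreport.intervalMsec).

```python
# Illustrative arithmetic only; the cluster figures below are made-up assumptions.
DEFAULT_INTERVAL_MS = 6 * 60 * 60 * 1000  # 6 hours, the default interval cited above
MS_PER_DAY = 24 * 60 * 60 * 1000


def full_reports_per_day(num_datanodes, interval_ms=DEFAULT_INTERVAL_MS):
    """Full block reports the NameNode receives per day on the periodic schedule."""
    return num_datanodes * MS_PER_DAY // interval_ms


def extra_reports_per_day(num_datanodes, setrep_calls_per_day):
    """Additional full reports if every setrep first triggers all datanodes."""
    return num_datanodes * setrep_calls_per_day


if __name__ == "__main__":
    datanodes = 500     # hypothetical cluster size
    setrep_calls = 200  # hypothetical number of setrep invocations per day
    baseline = full_reports_per_day(datanodes)
    extra = extra_reports_per_day(datanodes, setrep_calls)
    print("periodic full reports per day:", baseline)
    print("extra full reports per day:  ", extra)
    print("load multiplier:             ", (baseline + extra) / baseline)
```

Each full report also lists every replica on the reporting datanode, so the real cost scales with the amount of data as well as the number of reports. This sketch counts reports only.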

bq. However, the namenode would not notice a missing block until the next block report, up to 6 hours later. In this
case, we are supposed to trigger a block report for all datanodes before setrep -w. Furthermore,
if we want to set the replication of blocks to 1, some blocks may become corrupt.
You should never set the replication factor of a file to 1 unless you are okay with losing
the data or it can be trivially regenerated.

bq. It is OK to use a script to trigger block reports from all datanodes, or just restart the namenode.
Neither is necessary or recommended. You should trust the self-healing mechanisms of HDFS
to detect and deal with lost blocks, and let go of the expectation that all blocks will have
exactly the expected number of replicas at all times. Under- and over-replication are common
in any real cluster as disks fail, network links get congested, or nodes go away and come
back.

> Allow trigger block report from all datanodes
> ---------------------------------------------
>                 Key: HDFS-10359
>                 URL: https://issues.apache.org/jira/browse/HDFS-10359
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.7.0, 2.6.1
>            Reporter: Tao Jie
> Since HDFS-7278 allows triggering a block report from a single datanode, it would
> be helpful to add an option to this command to trigger block reports from all datanodes.
> The command might look like this:
> *hdfs dfsadmin -triggerBlockReport \[-incremental\] <datanode_host:ipc_port|all>*
