hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11090) Leave safemode immediately if all blocks have reported in
Date Tue, 15 Nov 2016 01:02:38 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15665601#comment-15665601 ]

Andrew Wang commented on HDFS-11090:
------------------------------------

Thanks for the comment, Konst. If you look at Jing's comment again, random replication means
that once all blocks have reported in, almost all FBRs will have been received too. Since this
change would only skip the safemode extension, the worst case is an additional 30s of
replication work, which isn't that massive. Finally, clusters that want to avoid unnecessary
replication can (should?) set the min datanode threshold to an appropriate value.

The {{-D}} suggestion is good, but changing the startup script is functionally the same as
changing the config, so no better in terms of UX.

If this seems too complicated, for my use case it would also be sufficient to special-case
the "empty cluster" scenario. That way existing clusters would not be affected.

> Leave safemode immediately if all blocks have reported in
> ---------------------------------------------------------
>
>                 Key: HDFS-11090
>                 URL: https://issues.apache.org/jira/browse/HDFS-11090
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.7.3
>            Reporter: Andrew Wang
>            Assignee: Yiqun Lin
>         Attachments: HDFS-11090.001.patch
>
>
> Startup safemode is triggered by two thresholds: % blocks reported in, and min # datanodes.
> It's extended by an interval (default 30s) until these two thresholds are met.
> Safemode extension is helpful when the cluster has data, and the default % blocks threshold
> (0.99) is used. It gives DNs a little extra time to report in and thus avoid unnecessary
> replication work.
> However, we can leave startup safemode early if 100% of blocks have reported in.
> Note that operators sometimes change the % blocks threshold to > 1 to never automatically
> leave safemode. We should maintain this behavior.
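
For readers following the thread, the proposed behavior can be sketched roughly as below. This
is only an illustration of the rule described in the issue, not the NameNode implementation or
the attached patch: the class name, the Decision enum, and the decide() method are all
hypothetical, and the threshold comparison is a simplification.

```java
// Illustrative sketch only; none of these names come from the HDFS source tree.
public class SafeModeSketch {

    /** Hypothetical outcomes of the startup safemode check. */
    enum Decision { STAY, LEAVE_AFTER_EXTENSION, LEAVE_NOW }

    /**
     * Decide what startup safemode should do, following the rule proposed in the issue.
     *
     * @param totalBlocks    blocks the namespace expects
     * @param reportedBlocks blocks reported in so far
     * @param liveDatanodes  datanodes that have registered
     * @param thresholdPct   the "% blocks" threshold (a value > 1 means "never leave automatically")
     * @param minDatanodes   the "min # datanodes" threshold
     */
    static Decision decide(long totalBlocks, long reportedBlocks, int liveDatanodes,
                           double thresholdPct, int minDatanodes) {
        // Preserve the operator trick: a threshold above 1 must never auto-exit.
        // This needs an explicit check, otherwise an empty cluster (0 >= 0) would slip through.
        if (thresholdPct > 1.0) {
            return Decision.STAY;
        }
        boolean blocksMet = reportedBlocks >= thresholdPct * totalBlocks;
        boolean nodesMet = liveDatanodes >= minDatanodes;
        if (!blocksMet || !nodesMet) {
            return Decision.STAY;
        }
        // Proposed change: with 100% of blocks reported there is nothing left to wait for,
        // so skip the extension. An empty cluster (0 of 0 blocks) also lands here.
        if (reportedBlocks >= totalBlocks) {
            return Decision.LEAVE_NOW;
        }
        // Thresholds met but some blocks still missing: keep the usual extension interval.
        return Decision.LEAVE_AFTER_EXTENSION;
    }

    public static void main(String[] args) {
        System.out.println(decide(1000, 1000, 3, 0.99, 1)); // all blocks reported
        System.out.println(decide(1000, 995, 3, 0.99, 1));  // threshold met, not 100%
        System.out.println(decide(0, 0, 0, 0.99, 0));       // empty cluster
        System.out.println(decide(0, 0, 3, 1.5, 0));        // threshold > 1
    }
}
```

The min datanode check is kept ahead of the early exit, so clusters that set that threshold to
avoid unnecessary replication would still wait for enough datanodes to register.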



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

