hadoop-hdfs-issues mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-528) Add dfsadmin -waitDatanodes feature to block until DNs have reported
Date Wed, 05 Aug 2009 07:52:15 GMT

    [ https://issues.apache.org/jira/browse/HDFS-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739364#action_12739364 ]

dhruba borthakur commented on HDFS-528:
---------------------------------------

Another generic approach is to specify the number of datanodes to wait for as a percentage
of the total number of datanodes in the cluster. You would have to use the "includelist" feature
of HDFS to list all the known datanodes (which most admins probably do). In fact, the NN
may exit safemode only if the specified percentage of datanodes has checked in with the NN.
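
As a rough illustration of the percentage idea above, here is a minimal Python sketch. It is
not HDFS code: the function names, the plain list of hostnames standing in for the include
list, and the collection of live hosts are all assumptions made for the example.

```python
import math


def required_datanodes(include_list, percent):
    """Return how many known datanodes must report to satisfy `percent`.

    `include_list` is a hypothetical stand-in for the hosts named in the
    HDFS include list; duplicates are ignored.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    known = set(include_list)
    # Round up so that, for example, 75% of 5 nodes requires 4 nodes.
    return math.ceil(len(known) * percent / 100)


def enough_datanodes(include_list, live_nodes, percent):
    """Return True if enough of the known datanodes have checked in.

    Hosts that report but are not in the include list are not counted.
    """
    known = set(include_list)
    reported = known & set(live_nodes)
    return len(reported) >= required_datanodes(known, percent)
```

With four known datanodes and a 75% threshold, this check passes only once three of the
listed hosts have reported; hosts outside the include list do not count toward the total.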


Many times, when we restart our cluster, many datanodes fail to join the NN. However, the
NN exits safemode because it finds at least one replica of every block. Then the NN starts
replicating blocks. We have to manually enter safemode, manually look at the datanodes that
have refused to join the NN, fix them, and then exit safemode. Your proposed feature helps
in elegantly handling this scenario.

> Add dfsadmin -waitDatanodes feature to block until DNs have reported
> --------------------------------------------------------------------
>
>                 Key: HDFS-528
>                 URL: https://issues.apache.org/jira/browse/HDFS-528
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: scripts
>    Affects Versions: 0.21.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hdfs-528.txt
>
>
> When starting up a fresh cluster programmatically, users often want to wait until DFS
> is "writable" before continuing in a script. "dfsadmin -safemode wait" doesn't quite work
> for this on a completely fresh cluster, since when there are 0 blocks on the system, 100%
> of them are accounted for before any DNs have reported.
> This JIRA is to add a command which waits until a certain number of DNs have reported
> as alive to the NN.
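
The waiting behaviour described above can be sketched as a simple polling loop. This is a
hypothetical Python illustration, not the attached patch: `count_live` is an assumed
caller-supplied function that returns how many DNs the NN currently reports as alive, and
the timeout and poll-interval parameters are invented for the example.

```python
import time


def wait_for_datanodes(count_live, min_datanodes, timeout_s=60.0,
                       poll_interval_s=1.0, sleep=time.sleep,
                       clock=time.monotonic):
    """Block until at least `min_datanodes` DNs are alive or the timeout passes.

    `count_live` is a hypothetical callable returning the current number of
    live datanodes; how it obtains that number is left to the caller.
    Returns True if the threshold was reached, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if count_live() >= min_datanodes:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

A startup script could call this once after launching the cluster and continue only if it
returns True. Unlike a block-based check, the condition cannot be satisfied on an empty
filesystem before any DN has reported.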

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

