hadoop-hdfs-issues mailing list archives

From "Ajith S (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8693) refreshNamenodes does not support adding a new standby to a running DN
Date Tue, 04 Aug 2015 06:39:05 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653150#comment-14653150 ]

Ajith S commented on HDFS-8693:
-------------------------------

Hi [~john.jian.fang] and [~kihwal]

Agreed, refreshNamenodes needs to be fixed. In refreshNNList, can we just add an actor for
the new NN and replace the old NN's actor in the block pool service?
I would like to work on this issue :)
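
To make the add-and-replace idea concrete: instead of rejecting any difference between the old
and new NN address sets, refreshNNList could first split the difference into addresses to add
and addresses to remove. The following is only a minimal sketch under stated assumptions. It
uses plain java.util sets rather than Guava, the class name NNListDiff and the host names are
hypothetical, and it does not show how BPServiceActor instances would actually be started or
stopped.

```java
import java.net.InetSocketAddress;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper (not part of Hadoop): splits an NN list change into additions and removals. */
public class NNListDiff {
  /** Addresses present only in the new list; each would need a new actor. */
  final Set<InetSocketAddress> toAdd;
  /** Addresses present only in the old list; each actor would be stopped. */
  final Set<InetSocketAddress> toRemove;

  NNListDiff(Set<InetSocketAddress> oldAddrs, Set<InetSocketAddress> newAddrs) {
    toAdd = new HashSet<>(newAddrs);
    toAdd.removeAll(oldAddrs);
    toRemove = new HashSet<>(oldAddrs);
    toRemove.removeAll(newAddrs);
  }

  public static void main(String[] args) {
    // Hypothetical host names; createUnresolved avoids DNS lookups in this sketch.
    Set<InetSocketAddress> oldAddrs = new HashSet<>(Arrays.asList(
        InetSocketAddress.createUnresolved("nn1.example.com", 8020),
        InetSocketAddress.createUnresolved("nn2.example.com", 8020)));
    Set<InetSocketAddress> newAddrs = new HashSet<>(Arrays.asList(
        InetSocketAddress.createUnresolved("nn1.example.com", 8020),
        InetSocketAddress.createUnresolved("nn3.example.com", 8020)));

    NNListDiff diff = new NNListDiff(oldAddrs, newAddrs);
    System.out.println("actors to start: " + diff.toAdd);
    System.out.println("actors to stop:  " + diff.toRemove);
  }
}
```

In this example the actor for nn1 would be left untouched, one actor would be started for nn3,
and the actor for nn2 would be stopped.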

> refreshNamenodes does not support adding a new standby to a running DN
> ----------------------------------------------------------------------
>
>                 Key: HDFS-8693
>                 URL: https://issues.apache.org/jira/browse/HDFS-8693
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, ha
>    Affects Versions: 2.6.0
>            Reporter: Jian Fang
>            Priority: Critical
>
> I tried to run the following command on a Hadoop 2.6.0 cluster with HA support:
> $ hdfs dfsadmin -refreshNamenodes datanode-host:port
> to refresh the name nodes on the data nodes after I replaced one name node with a new one,
> so that I would not need to restart the data nodes. However, I got the following error:
> refreshNamenodes: HA does not currently support adding a new standby to a running DN.
> Please do a rolling restart of DNs to reconfigure the list of NNs.
> I checked the 2.6.0 code and found that the error is thrown by the following code snippet,
> which led me to this JIRA.
> void refreshNNList(ArrayList<InetSocketAddress> addrs) throws IOException {
>   Set<InetSocketAddress> oldAddrs = Sets.newHashSet();
>   for (BPServiceActor actor : bpServices) {
>     oldAddrs.add(actor.getNNSocketAddress());
>   }
>   Set<InetSocketAddress> newAddrs = Sets.newHashSet(addrs);
>   if (!Sets.symmetricDifference(oldAddrs, newAddrs).isEmpty()) {
>     // Keep things simple for now -- we can implement this at a later date.
>     throw new IOException(
>         "HA does not currently support adding a new standby to a running DN. " +
>         "Please do a rolling restart of DNs to reconfigure the list of NNs.");
>   }
> }
> It looks like the refreshNamenodes command is an incomplete feature.
> Unfortunately, bringing up a new name node on a replacement instance is critical for
> auto-provisioning a Hadoop cluster with HDFS HA support. Without this support, the HA
> feature cannot really be used. I also observed that the new standby name node on the
> replacement instance can get stuck in safe mode because no data nodes check in with it.
> Even with a rolling restart, it may take quite some time to restart all data nodes in a
> big cluster, for example one with 4000 data nodes, and restarting DNs is too intrusive to
> be a preferable operation in production. It also increases the chance of a double failure,
> because the standby name node is not really ready for a failover if the current active
> name node fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
