hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2231) Configuration changes for HA namenode
Date Tue, 16 Aug 2011 21:16:28 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085977#comment-13085977
] 

Suresh Srinivas commented on HDFS-2231:
---------------------------------------

I had used VIP address to mean failover address, which seems to have caused the confusion.
Here is the second part rewritten:

For a discussion of the existing configuration, see the first part of https://issues.apache.org/jira/browse/HDFS-2231?focusedCommentId=13080279&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13080279

h2. Configuration requirements for HA
*Terminology:*
# NNAddress1, NNAddress2 - address of individual NNs. They could be logical addresses.
# NNActiveAddress - Address where the active is running. This is one of NNAddress1 or NNAddress2.
# NNStandbyAddress - Where the standby is running. This is one of NNAddress1 or NNAddress2.
# NNFailoverAddress - The address of the active namenode, used by HA setups that rely on an IP failover mechanism.

*Requirements:*
# Backward compatibility: Existing deployments must be able to use the existing configuration
without any change.
# Datanodes and clients need to know about both namenodes through configuration.
# As much as possible, the configuration for all the nodes must be the same. The special configuration
required for different node types (namenodes, datanodes, gateways) should be minimal.

h3. HA solution that uses IP failover
# The system needs to be configured with three sets of addresses: NNFailoverAddress, NNAddress1, and NNAddress2.
# To get to the active namenode, clients use NNFailoverAddress.
# To discover NNStandbyAddress, clients and datanodes may use ZooKeeper or try NNAddress1 and NNAddress2.

h3. Active and Standby namenode addresses without IP failover
# This setup does not require NNFailoverAddress.
# To discover NNActiveAddress and NNStandbyAddress, clients and datanodes may try NNAddress1
and NNAddress2 or use ZooKeeper.
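
The "try NNAddress1 and NNAddress2" option could be sketched roughly as below. This is only an illustration: the {{Role}} enum, the {{RoleFinder}} class and the {{probe}} function are hypothetical names, and how a node would actually report its role is not defined in this proposal.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch: walk the configured namenode addresses and ask each one for its role.
final class RoleFinder {
    // Assumed set of answers a probe could give; not an existing HDFS type.
    enum Role { ACTIVE, STANDBY, UNREACHABLE }

    // Returns the first address whose probe reports the wanted role, if any.
    static Optional<String> find(List<String> addresses, Role wanted,
                                 Function<String, Role> probe) {
        for (String address : addresses) {
            if (probe.apply(address) == wanted) {
                return Optional.of(address);
            }
        }
        return Optional.empty();
    }
}
```

A client would call {{find}} with {{Role.ACTIVE}}, a datanode would call it once per role so that it learns both addresses. A ZooKeeper-based lookup could sit behind the same {{probe}} signature.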

h2. Proposal
h3. For solutions using IP Failover
# NNFailoverAddress-related configuration goes into the existing configuration (Set 1 above). I propose
using the existing keys: DFS_NAMENODE_RPC_ADDRESS_KEY, DFS_NAMENODE_SERVICE_RPC_ADDRESS_KEY,
DFS_NAMENODE_HTTP_ADDRESS_KEY, DFS_NAMENODE_HTTPS_ADDRESS_KEY

h3. Generic part common to both IP failover and non IP failover based solutions
*How do we add both namenodes into a common configuration?*
Datanodes need to know both the namenode addresses. I propose adding
DFS_NAMENODE_IDS (dfs.namenode.ids), whose value is a comma separated list of ids (any appropriate strings).
Add the (Set 2) keys suffixed with "." + <NamenodeID>.
Clients and datanodes can read DFS_NAMENODE_IDS and use each id as the suffix to get the corresponding
parameters to use.
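
A minimal sketch of that lookup, assuming the configuration is available as a flat key/value map. {{NamenodeIdLookup}} and its methods are hypothetical names used for illustration and are not existing HDFS code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: resolves per-namenode keys from a flat view of the configuration.
final class NamenodeIdLookup {
    static final String IDS_KEY = "dfs.namenode.ids";

    // Splits the comma separated id list, trimming whitespace and skipping blanks.
    static List<String> namenodeIds(Map<String, String> conf) {
        List<String> ids = new ArrayList<>();
        String raw = conf.get(IDS_KEY);
        if (raw == null) {
            return ids;
        }
        for (String id : raw.split(",")) {
            String trimmed = id.trim();
            if (!trimmed.isEmpty()) {
                ids.add(trimmed);
            }
        }
        return ids;
    }

    // Maps each id to the value of baseKey + "." + id; ids without such a key are skipped.
    static Map<String, String> perNamenode(Map<String, String> conf, String baseKey) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String id : namenodeIds(conf)) {
            String value = conf.get(baseKey + "." + id);
            if (value != null) {
                out.put(id, value);
            }
        }
        return out;
    }
}
```

For example, {{perNamenode(conf, "dfs.namenode.rpc-address")}} on a configuration with ids "nn1, nn2" would give the rpc address of each namenode keyed by its id. The same call works for the other (Set 2) keys.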

*How does namenode know its NamenodeID and what configuration parameters to load?*
A namenode discovers its own configuration from the parameter DFS_NAMENODE_ID (dfs.namenode.id).
On namenodes, an xml include points to a file that sets DFS_NAMENODE_ID to the corresponding
NamenodeID. On other nodes, such as datanodes and client gateway machines, the xml include points
to an empty file. I like Todd's proposal, where a namenode that sees an empty or unconfigured DFS_NAMENODE_ID
could try binding to the rpc address and, when the bind succeeds, discover its NamenodeID from
the suffix in the config param. (We could drop DFS_NAMENODE_ID altogether.)
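
A rough sketch of the bind-based discovery, assuming the suffixed rpc addresses have already been collected into a map of NamenodeID to "host:port". {{NamenodeIdByBind}} is a hypothetical name, and the probe socket here is only a stand-in for the namenode binding its real RPC server.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.util.Map;

// Hypothetical sketch: a namenode infers its own id from the address it is able to bind.
final class NamenodeIdByBind {
    // Returns the first id whose address can be bound on this host, or null if none can.
    static String discover(Map<String, String> addressesById) {
        for (Map.Entry<String, String> entry : addressesById.entrySet()) {
            if (canBind(entry.getValue())) {
                return entry.getKey();
            }
        }
        return null;
    }

    // Tries to bind a throwaway server socket to "host:port"; the socket is closed right away.
    static boolean canBind(String hostPort) {
        int colon = hostPort.lastIndexOf(':');
        if (colon < 0) {
            return false;
        }
        int port;
        try {
            port = Integer.parseInt(hostPort.substring(colon + 1));
        } catch (NumberFormatException e) {
            return false;
        }
        if (port < 0 || port > 65535) {
            return false;
        }
        try (ServerSocket probe = new ServerSocket()) {
            probe.bind(new InetSocketAddress(hostPort.substring(0, colon), port));
            return true;
        } catch (IOException | SecurityException e) {
            // Address is not local, is unresolved, is already in use, or binding is not permitted.
            return false;
        }
    }
}
```

Two caveats with this approach: a wildcard address such as 0.0.0.0 binds on every host, so it cannot identify a namenode, and a port that is already in use looks the same as an address that belongs to another host.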

Example for deployments without IP failover:
NNAddress1 = host1:port
NNAddress2 = host2:port

{noformat}
<property>
<name>dfs.namenode.ids</name>
<value>nn1, nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nn1</name>
<value>host1:port</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nn2</name>
<value>host2:port</value>
</property>
{noformat}

Example for deployments with IP failover:
NNFailoverAddress = failoverAddress:port
NNAddress1 = host1:port
NNAddress2 = host2:port

{noformat}
<property>
<name>dfs.namenode.ids</name>
<value>nn1, nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address</name>
<value>failoverAddress:port</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nn1</name>
<value>host1:port</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nn2</name>
<value>host2:port</value>
</property>
{noformat}
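
Given the two examples above, a client could pick the addresses to try for the active namenode roughly as follows. This is a sketch under the assumption that an unsuffixed dfs.namenode.rpc-address means an IP failover deployment. {{ActiveAddressCandidates}} is a hypothetical name, not existing HDFS code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: choose which rpc addresses a client should try for the active namenode.
final class ActiveAddressCandidates {
    static final String RPC_KEY = "dfs.namenode.rpc-address";
    static final String IDS_KEY = "dfs.namenode.ids";

    static List<String> candidates(Map<String, String> conf) {
        // With IP failover, the unsuffixed key holds NNFailoverAddress: use only that.
        String failover = conf.get(RPC_KEY);
        if (failover != null && !failover.trim().isEmpty()) {
            return Collections.singletonList(failover.trim());
        }
        // Without IP failover, fall back to NNAddress1 and NNAddress2 via the id suffixes.
        List<String> out = new ArrayList<>();
        String ids = conf.get(IDS_KEY);
        if (ids == null) {
            return out;
        }
        for (String id : ids.split(",")) {
            String address = conf.get(RPC_KEY + "." + id.trim());
            if (address != null) {
                out.add(address.trim());
            }
        }
        return out;
    }
}
```

With the first example this would return host1:port and host2:port. With the second it would return only failoverAddress:port. Existing deployments that set only dfs.namenode.rpc-address take the first branch unchanged, which matches the backward compatibility requirement.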



> Configuration changes for HA namenode
> -------------------------------------
>
>                 Key: HDFS-2231
>                 URL: https://issues.apache.org/jira/browse/HDFS-2231
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>             Fix For: HA branch (HDFS-1623)
>
>
> This jira tracks the changes required for configuring HA setup for namenodes.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
