hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2231) Configuration changes for HA namenode
Date Fri, 05 Aug 2011 23:02:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080279#comment-13080279 ]

Suresh Srinivas commented on HDFS-2231:
---------------------------------------

h2. Current namenode and related configuration:
h3. Configuration that does not change for HA
*BackupNode related:*
DFS_NAMENODE_BACKUP_ADDRESS_KEY, DFS_NAMENODE_BACKUP_HTTP_ADDRESS_KEY, DFS_NAMENODE_BACKUP_SERVICE_RPC_ADDRESS_KEY

*Secondary namenode related*
DFS_NAMENODE_SECONDARY_HTTP_ADDRESS_KEY, DFS_SECONDARY_NAMENODE_KEYTAB_FILE_KEY, DFS_SECONDARY_NAMENODE_USER_NAME_KEY,
DFS_SECONDARY_NAMENODE_KRB_HTTPS_USER_NAME_KEY

*Checkpointer related*
DFS_NAMENODE_CHECKPOINT_PERIOD_KEY, DFS_NAMENODE_CHECKPOINT_SIZE_KEY, DFS_NAMENODE_CHECKPOINT_DIR_KEY,
DFS_NAMENODE_CHECKPOINT_EDITS_DIR_KEY

h3. Common configuration for active and standby (Set 1)
DFS_NAMENODE_SAFEMODE_EXTENSION_KEY, DFS_NAMENODE_SAFEMODE_THRESHOLD_PCT_KEY, DFS_NAMENODE_HOSTS_KEY,
DFS_NAMENODE_HOSTS_EXCLUDE_KEY, DFS_NAMENODE_DECOMMISSION_INTERVAL_KEY, DFS_NAMENODE_DECOMMISSION_NODES_PER_INTERVAL_KEY,
DFS_NAMENODE_HANDLER_COUNT_KEY, DFS_NAMENODE_SERVICE_HANDLER_COUNT_KEY, DFS_NAMENODE_PLUGINS_KEY,
DFS_NAMENODE_STARTUP_KEY, DFS_NAMENODE_NAME_CACHE_THRESHOLD_KEY, DFS_NAMENODE_MAX_OBJECTS_KEY,
DFS_NAMENODE_UPGRADE_PERMISSION_KEY, DFS_NAMENODE_HEARTBEAT_RECHECK_INTERVAL_KEY, DFS_NAMENODE_ACCESSTIME_PRECISION_KEY,
DFS_NAMENODE_REPLICATION_CONSIDERLOAD_KEY, DFS_NAMENODE_REPLICATION_INTERVAL_KEY, DFS_NAMENODE_REPLICATION_MIN_KEY,
DFS_NAMENODE_REPLICATION_PENDING_TIMEOUT_SEC_KEY, DFS_NAMENODE_REPLICATION_MAX_STREAMS_KEY,
DFS_NAMENODE_DELEGATION_KEY_UPDATE_INTERVAL_KEY, DFS_NAMENODE_DELEGATION_TOKEN_RENEW_INTERVAL_KEY,
DFS_NAMENODE_DELEGATION_TOKEN_MAX_LIFETIME_KEY, DFS_NAMENODE_MAX_COMPONENT_LENGTH_KEY, DFS_NAMENODE_MAX_DIRECTORY_ITEMS_KEY,
DFS_NAMENODE_KEYTAB_FILE_KEY, DFS_NAMENODE_USER_NAME_KEY, DFS_NAMENODE_KRB_HTTPS_USER_NAME_KEY

h3. Configuration that is different for active and standby (Set 2)
*Address configuration:*
FS_DEFAULT_NAME_KEY, DFS_NAMENODE_RPC_ADDRESS_KEY, DFS_NAMENODE_SERVICE_RPC_ADDRESS_KEY, DFS_NAMENODE_HTTP_ADDRESS_KEY,
DFS_NAMENODE_HTTPS_ADDRESS_KEY

*Storage directories* (the external NFS storage directory could be different for active/standby)
DFS_NAMENODE_NAME_DIR_KEY, DFS_NAMENODE_EDITS_DIR_KEY

h3. Unused
DFS_NAMENODE_NAME_DIR_RESTORE_KEY // Will remove this in a separate patch

h2. Configuration requirements for HA
*Terminology:*
# NNAddress1, NNAddress2 - address of individual NNs
# NNActiveAddress - The address where the active namenode provides namenode service.
# NNStandbyAddress - The address where the standby namenode provides service, such as read-only operations.
# NNVIPAddress/FailoverAddress - The VIP address used by the HA setup. This address is
owned by the active namenode, so NNActiveAddress is the same as NNVIPAddress.

*Requirements:*
# Existing deployments must be able to use the configuration without any change.
# Datanodes and clients need to know both namenodes, that is, active and standby, either through
configuration or through a mechanism such as ZooKeeper.
# As much as possible, the configuration for all the nodes must be the same. The special configuration
required for different node types should be minimal.

h3. HA solution uses VIP address
# System needs to be configured with three sets of addresses, NNVIPAddress, NNAddress1 and
NNAddress2.
# Clients and datanodes use NNVIPAddress as NNActiveAddress.
# To discover NNStandbyAddress, clients and datanodes may try NNAddress1 and NNAddress2 or use
a mechanism such as ZooKeeper.

h3. Active and Standby namenode addresses without VIP address
# This setup does not require NNVIPAddress.
# To discover NNActiveAddress and NNStandbyAddress, clients and datanodes may try NNAddress1
and NNAddress2 or use a mechanism such as ZooKeeper.
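
The "try NNAddress1 and NNAddress2" step could be sketched roughly as follows. This is only an illustration: the class {{NamenodeDiscovery}}, the {{Role}} enum and the {{probe}} function are hypothetical names, and how a namenode would actually report whether it is active or standby is not specified here.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, not an existing Hadoop class.
final class NamenodeDiscovery {
    enum Role { ACTIVE, STANDBY }

    // Probes each candidate address in order and records the first address
    // seen for each role. The probe is assumed to return null when the
    // address is unreachable or its role cannot be determined.
    static Map<Role, String> discover(List<String> candidates,
                                      Function<String, Role> probe) {
        Map<Role, String> found = new EnumMap<>(Role.class);
        for (String address : candidates) {
            Role role = probe.apply(address);
            if (role != null && !found.containsKey(role)) {
                found.put(role, address);
            }
        }
        return found;
    }
}
```

A caller would pass NNAddress1 and NNAddress2 as the candidates. A ZooKeeper based lookup would replace the probing loop entirely.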

h2. Proposal
h3. For VIP based solutions
# NNVIPAddress related configuration goes into the common configuration (Set 1 above). I propose using
the existing keys: DFS_NAMENODE_RPC_ADDRESS_KEY, DFS_NAMENODE_SERVICE_RPC_ADDRESS_KEY, DFS_NAMENODE_HTTP_ADDRESS_KEY,
DFS_NAMENODE_HTTPS_ADDRESS_KEY. That is, keep using the existing parameters for the VIP address and add new
parameters for nn1 and nn2.
# The active namenode uses this information to start services at appropriate addresses.

h3. Generic part common to both VIP and non-VIP based solutions
*How do we add both namenodes into a common configuration?*
Datanodes need to know both the namenode addresses. Doing it in a single config file enables
this. 

To do this, I propose adding:
DFS_NAMENODE_IDS (dfs.namenode.ids), a comma-separated list of IDs (any appropriate string).
For each ID, add the (Set 2) parameters suffixed with "." + <NamenodeID>.
Clients and datanodes can read DFS_NAMENODE_IDS and use each ID as a suffix to find the corresponding
parameters to load.
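
A minimal sketch of how a client or datanode might apply this lookup, assuming the configuration has already been loaded into a plain {{Map<String, String>}}. The class and method names are hypothetical and are not part of Hadoop. Only the key names dfs.namenode.ids and dfs.namenode.rpc-address come from this proposal.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not an existing Hadoop class.
final class NamenodeIdResolver {
    static final String IDS_KEY = "dfs.namenode.ids";

    // Splits the comma-separated ID list, trimming whitespace and
    // skipping empty entries. Returns an empty list when the key is
    // absent, as in an existing non-HA deployment.
    static List<String> namenodeIds(Map<String, String> conf) {
        List<String> ids = new ArrayList<>();
        String raw = conf.get(IDS_KEY);
        if (raw == null) {
            return ids;
        }
        for (String id : raw.split(",")) {
            String trimmed = id.trim();
            if (!trimmed.isEmpty()) {
                ids.add(trimmed);
            }
        }
        return ids;
    }

    // Maps each NamenodeID to the value of baseKey + "." + NamenodeID,
    // leaving out IDs that do not define the suffixed key.
    static Map<String, String> perNamenode(Map<String, String> conf, String baseKey) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String id : namenodeIds(conf)) {
            String value = conf.get(baseKey + "." + id);
            if (value != null) {
                result.put(id, value);
            }
        }
        return result;
    }
}
```

Calling {{perNamenode(conf, "dfs.namenode.rpc-address")}} would return one entry per configured NamenodeID, so the same helper could be reused for every (Set 2) parameter.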

*How does namenode know its NamenodeID and what configuration parameters to load?*
The namenode discovers its own configuration from the parameter DFS_NAMENODE_ID (dfs.namenode.id).
On namenodes, an XML include points to a file that sets DFS_NAMENODE_ID to the corresponding
NamenodeID. On other nodes, such as datanodes and client gateway machines, the XML include points
to an empty file.
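
The namenode side of this could look roughly like the sketch below, again assuming the merged configuration is available as a plain {{Map<String, String>}}. {{LocalNamenodeConfig}} is a hypothetical name. Falling back to the unsuffixed key when dfs.namenode.id is unset or the suffixed key is missing is an assumption, chosen so that existing deployments keep working without changes.

```java
import java.util.Map;

// Hypothetical helper, not an existing Hadoop class.
final class LocalNamenodeConfig {
    static final String ID_KEY = "dfs.namenode.id";

    // Looks up baseKey for this node. When dfs.namenode.id is set, the
    // suffixed key (baseKey + "." + id) is tried first. Otherwise, or if
    // the suffixed key is missing, the unsuffixed key is used (assumed
    // fallback, not stated in the proposal).
    static String resolve(Map<String, String> conf, String baseKey) {
        String id = conf.get(ID_KEY);
        if (id != null && !id.trim().isEmpty()) {
            String suffixed = conf.get(baseKey + "." + id.trim());
            if (suffixed != null) {
                return suffixed;
            }
        }
        return conf.get(baseKey);
    }
}
```

On a node whose include file is empty, dfs.namenode.id is absent and {{resolve}} simply returns the unsuffixed value.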

Example:
{noformat}
<property>
<name>dfs.namenode.ids</name>
<value>nn1, nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address</name>
<value>host1:port</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nn1</name>
<value>host1:port</value>
</property>
<property>
<name>dfs.namenode.rpc-address.nn2</name>
<value>host2:port</value>
</property>
{noformat}


> Configuration changes for HA namenode
> -------------------------------------
>
>                 Key: HDFS-2231
>                 URL: https://issues.apache.org/jira/browse/HDFS-2231
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>             Fix For: HA branch (HDFS-1623)
>
>
> This jira tracks the changes required for configuring HA setup for namenodes.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

       
