hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1973) HA: HDFS clients must handle namenode failover and switch over to the new active namenode.
Date Thu, 18 Aug 2011 19:54:28 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13087225#comment-13087225
] 

Suresh Srinivas commented on HDFS-1973:
---------------------------------------

Sorry for the late comment. I had been traveling.

Before {{Cases to support}}, could we add a section like this:
>>
On failover, clients need the address of the new active. This could be done by:
# Contacting zookeeper to get the current active NN.
# Alternatively, the client is configured with the addresses of both namenodes and tries them
one at a time until it connects to the new active.
# For setups using IP failover, clients always use the same VIP/failover address, which moves
to the active.
>>
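The second option above could be sketched roughly like this (a minimal illustration only; the class, method, and address names are made up for the example and are not actual HDFS APIs):

```java
import java.util.List;
import java.util.function.Predicate;

// Sketch of option 2: the client knows both namenode addresses and
// tries them in order until one responds as active. "isActive" stands
// in for a real RPC probe; all names here are illustrative.
public class NamenodeFailover {
    private final List<String> namenodeAddrs;

    public NamenodeFailover(List<String> namenodeAddrs) {
        this.namenodeAddrs = namenodeAddrs;
    }

    // Returns the first namenode address that answers as active,
    // or throws if neither does.
    public String findActive(Predicate<String> isActive) {
        for (String addr : namenodeAddrs) {
            if (isActive.test(addr)) {
                return addr;
            }
        }
        throw new IllegalStateException("no active namenode reachable");
    }
}
```

A real client would of course retry with backoff rather than give up after one pass.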
Given this, I am not sure about the {{Cases to support}}:
Proxy based client failover is an implementation detail. It still needs to figure out the
new active based on one of the schemes above. I am not very clear on configuration based support.
Do you mean here that the client config will be changed to point to the new active? DNS SRV
records are also unnecessary, given our config would have both the namenode addresses.

+1 for the logical URI. We could consider merging this requirement into HDFS-2231.


A logical URI is needed to identify a nameservice, not a cluster, since federation supports
multiple namenodes within a cluster. We could use the concept of a nameservice, introduced
in federation, for that. So the URI would be nameservice1.foo.com, where nameservice1 maps
to nn1 and nn2.
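To make that mapping concrete, the client-side configuration could look something like the following (property names and addresses are illustrative, following the federation naming style; this is not a settled proposal):

```xml
<!-- Illustrative only: a logical nameservice resolving to two namenodes. -->
<property>
  <name>dfs.nameservices</name>
  <value>nameservice1</value>
</property>
<property>
  <name>dfs.ha.namenodes.nameservice1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.nameservice1.nn1</name>
  <value>nn1.foo.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.nameservice1.nn2</name>
  <value>nn2.foo.com:8020</value>
</property>
```

The client would then address the filesystem by the logical name (nameservice1) and resolve it to nn1/nn2 internally.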

Regarding viewfs, I think this scheme will work: the viewfs mount tables will point to the
logical URI, which in turn will use the mechanism you are proposing.

Why should the failover method be based on the cluster part of the URI? Can it be a single
mechanism across all the nameservices? If so, we could change the parameter to dfs.client.ha.failover.method.
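With a single client-wide parameter, the configuration would shrink to one key, something like (the value shown is a made-up placeholder for whichever failover scheme is chosen):

```xml
<!-- Illustrative: one client-wide failover method instead of per-cluster keys. -->
<property>
  <name>dfs.client.ha.failover.method</name>
  <value>configured</value>
</property>
```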

These are my early thoughts. Some questions I am left with are:
# The scheme you have defined works only for RPC protocols. How about HTTP?
# I am not sure why a logical URI is required for a VIP/failover based setup.

We could continue to add more details.


> HA: HDFS clients must handle namenode failover and switch over to the new active namenode.
> ------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1973
>                 URL: https://issues.apache.org/jira/browse/HDFS-1973
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Suresh Srinivas
>            Assignee: Aaron T. Myers
>
> During failover, a client must detect the current active namenode failure and switch
over to the new active namenode. The switch over might make use of IP failover or something
more elaborate such as zookeeper to discover the new active.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
