hadoop-hdfs-issues mailing list archives

From "Daryn Sharp (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2839) HA: Remove need for client configuration of nameservice ID
Date Wed, 28 Mar 2012 16:11:30 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240500#comment-13240500 ]

Daryn Sharp commented on HDFS-2839:

Some comments follow.  They reference and span both solutions A & B, to hopefully provide clarity.

* NN has a logical name that is authority in URI (hdfs://logicalName/path)
* (Requirement B.5) The non-HA NN’s DNS name is the logical name of NN.

I think this may be your intent, so would you please clarify/confirm?  The logical name (URI's
authority) for the NN should always be, for both HA and non-HA, a DNS name with a primary
mapping to the main NN.  The logical name may be a CNAME (see below) in the case of HA.  This
will allow both HA-aware and non-HA-aware clients to access the cluster.  The non-HA-aware clients
just won't fail over.

* The LogicalName is the DNS name that maps to a single IP which fails over
* DNS-Resolver - DNS-Name to IP or IPs (single IP in case of IP-Failover)

These two statements seem contradictory about whether the logical name (URI authority)
has a DNS mapping to one or many hosts.  I'm assuming that the "single IP" approach would rely
on config settings for failover support?

Here's how I would envision a DNS configuration to support both HA and non-HA:
* A record for logical name: nn.domain -> IP
** Non-HA client works as it does now.
** HA client has a single resolution so there's no failover.
* CNAME for logical name: nn.domain -> nn1.domain & nn2.domain
** The non-HA client works as it does now.  The CNAME is resolved to the primary address (nn1.domain).
 No code change required.
** The HA client can specifically query for all resolutions to build the failover list.

In fact, the HA aware client could be "smart" and instantiate the HA RPC proxy only if the
logical name has multiple resolutions.  The HA aware client resolves the logical name with
{{getAllByName}}, instead of {{getByName}}, to find the multiple mappings for HA.
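
As a minimal sketch of that idea (not the actual HDFS client code), the following uses the standard {{java.net.InetAddress}} lookups.  The class name {{LogicalNameResolver}} and the helper {{needsFailoverProxy}} are hypothetical names chosen for illustration, and the "more than one address means HA" rule is an assumption taken from the paragraph above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;

public class LogicalNameResolver {

    // Resolve the logical name (the URI authority) to every address the
    // resolver returns, rather than only the first one as getByName does.
    static List<InetAddress> resolveAll(String logicalName) throws UnknownHostException {
        return Arrays.asList(InetAddress.getAllByName(logicalName));
    }

    // Hypothetical policy: only build a failover proxy when there is
    // more than one candidate address to fail over to.
    static boolean needsFailoverProxy(List<InetAddress> addresses) {
        return addresses.size() > 1;
    }

    public static void main(String[] args) throws UnknownHostException {
        // "localhost" is only a stand-in default; pass a real logical name as args[0].
        String logicalName = args.length > 0 ? args[0] : "localhost";
        List<InetAddress> addresses = resolveAll(logicalName);
        System.out.println(logicalName + " resolves to " + addresses);
        System.out.println(needsFailoverProxy(addresses)
                ? "multiple resolutions: would create the HA (failover) proxy"
                : "single resolution: would create the plain proxy");
    }
}
```

A non-HA client that keeps calling {{getByName}} simply gets a single address back, which is why it would need no code change under this scheme.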

Regarding cross cluster access:
* Cross cluster access must be supported
* Cross cluster – The logical to IP mapping must be available across clusters
* ConfigFile-resolver - the mapping in the config file - this config file will need to
be available in all clusters, for all clusters to allow cross cluster access.

I'm uneasy about propagating the current model, in which clients require a lot of config info
about remote clusters.  Keeping those configs in sync becomes a maintenance burden, more so when
some users have their own configs.  Favoring the DNS/resolver approach should minimize the need
to sync all cluster configs for HA.
> HA: Remove need for client configuration of nameservice ID
> ----------------------------------------------------------
>                 Key: HDFS-2839
>                 URL: https://issues.apache.org/jira/browse/HDFS-2839
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: ha, hdfs client, name-node
>    Affects Versions: 0.24.0
>            Reporter: Jitendra Nath Pandey
>            Assignee: Jitendra Nath Pandey
> The fully qualified path from an HA cluster won't be usable from a different cluster
> that doesn't know about that particular namespace id.


