hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1973) HA: HDFS clients must handle namenode failover and switch over to the new active namenode.
Date Fri, 23 Sep 2011 00:05:30 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113045#comment-13113045

Aaron T. Myers commented on HDFS-1973:

Thanks a lot for the review, Todd. I'll attach an updated patch addressing most of your comments
in just a moment. Comments below.

bq. not sure if removing the DFSClient(Configuration conf) constructor is a good idea - it
was deprecated at 0.21 but since we haven't had a well-adopted release since then, seems we
should keep it around. (it's trivial to keep, right?)

Sure, but the whole of {{DFSClient}} is annotated {{@InterfaceAudience.Private}}. You still
think we should keep it around?

bq. some log messages left in: LOG.info("address of nn1: " + conf.get("dfs.atm.nn1"))

Whoops! Fixed. Any more that you noticed? Or just that one? :)

bq. some of the lines seem to have a lot of columns - try to wrap at 80?

Like which? I wasn't super strict about wrapping at 80, but I'm pretty sure none go over 90.

bq. maybe rename this parameter to nameNodeUri?


bq. Shouldn't that have a "." at the end? It seems authorities are appended without a "."
in the middle.

Yep, great catch. Fixed.

bq. why? It seems we should at least be logging a WARN if this is configured to point to a
class that can't be loaded, if not re-throwing as IOE.

Good point. I've changed it to throw an {{IOE}} with a descriptive error message.

bq. I'm not sure that o.a.h.h.protocol is the right package for ConfiguredFailoverProxyProvider
- maybe it belongs in the HA package?

Good point. I've moved it to o.a.h.server.namenode.ha.

bq. missing license headers on many of the new files

Fixed. It's just two new files.

bq. these will throw NPE in the case of misconfiguration. Should at least throw an exception
indicating what the mistaken config is.


bq. Also, it seems more sensible to have a single config like "dfs.ha.namenode.addresses"
with comma-separated URIs, since there's nothing explicitly tied to having exactly 2 NNs here,

Good point. Fixed.

bq. A general note: this failover proxy provider doesn't get passed enough info to choose
different configs based on the "logical URI". For example, if I have two clusters, foo and
bar I'd expect to be able to configure dfs.ha.namenodes.foo separately from dfs.ha.namenodes.bar
after setting both of dfs.client.failover.proxy.provider.foo|bar.

Good point. Mind if we address this in a follow-up JIRA? It seems like the solution would
be to replace "dfs.ha.namenode.addresses" with something like "dfs.ha.namenode.addresses.<logical-uri>",
so as to be able to support configuring the addresses of multiple HA clusters.

bq. Can we separate the addition of Idempotent annotations to a different patch?

Certainly. I've removed all of them from this patch except for {{getBlockLocations}}, which
is necessary for the test to work.

bq. file a followup JIRA or use "TODO" so we don't forget about this case?

Changed to:

// TODO(HA): This test should probably be made to fail if a client fails over
// to talk to an NN with a different block pool id. Once failover between
// active/standy in a single block pool is implemented, this test should be
// changed to exercise that.

> HA: HDFS clients must handle namenode failover and switch over to the new active namenode.
> ------------------------------------------------------------------------------------------
>                 Key: HDFS-1973
>                 URL: https://issues.apache.org/jira/browse/HDFS-1973
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Suresh Srinivas
>            Assignee: Aaron T. Myers
>         Attachments: HDFS-1973-HDFS-1623.patch, HDFS-1973-HDFS-1623.patch, hdfs-1973.0.patch
> During failover, a client must detect the current active namenode failure and switch
over to the new active namenode. The switch over might make use of IP failover or some thing
more elaborate such as zookeeper to discover the new active.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


View raw message