hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6440) Support more than 2 NameNodes
Date Wed, 13 May 2015 22:15:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542834#comment-14542834 ]

Aaron T. Myers commented on HDFS-6440:

bq. By setting the seed, you get the same sequence of NN failures. So one seed would do 1->2->1->3,
while another might do 1->3->2->1. Then, with the seed you could reproduce the series
of failovers in the same order, which seems like a laudable goal for the test, especially
when trying to debug weird error cases. Unless I'm missing something?

Right, I get the intended purpose, but one of us must be missing something because I still
think there's some funny stuff going on with the {{FAILOVER_SEED}} variable. :)

In the latest patch, you'll see that the variable {{FAILOVER_SEED}} is used as follows:

# Statically declare {{FAILOVER_SEED}} and initialize it to the value of {{System.currentTimeMillis()}}.
# Statically create {{failoverRandom}} to be a new {{Random}} object, initialized with the value of {{FAILOVER_SEED}}.
# In a static block, log the value of {{FAILOVER_SEED}}.
# In {{doWriteOverFailoverTest}}, reset the value of {{FAILOVER_SEED}} to again be {{System.currentTimeMillis()}}.
# Immediately thereafter in {{doWriteOverFailoverTest}}, log the new value of {{FAILOVER_SEED}}.

Note that there is no step 6 that resets {{failoverRandom}} to use the new value of {{FAILOVER_SEED}}
that was set in step 4, nor is {{FAILOVER_SEED}} used for anything else after step 5. Thus,
unless I'm missing something, it seems like steps 4 and 5 are at least superfluous, and at worst
misleading, since the test logs will contain a message about using a random seed that is in
fact never used.
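
To make the concern concrete, here is a minimal sketch of the pattern, not the code from the patch. The class name, the method names, and the log message are hypothetical; only the {{FAILOVER_SEED}} / {{failoverRandom}} relationship is taken from the steps above. It assumes the fix would be to rebuild the {{Random}} whenever the seed is reset.

```java
import java.util.Random;

// Hypothetical stand-in for the test class under discussion; names mirror the
// comment above, but this is not the code from the patch.
final class FailoverSeedSketch {
    // Steps 1 and 2: a static seed and a Random built from it.
    // (Left non-final only so the sketch can reassign them.)
    static long failoverSeed = System.currentTimeMillis();
    static Random failoverRandom = new Random(failoverSeed);

    // Steps 4 and 5 as described: the seed variable changes, the Random does not,
    // so the logged value has no effect on later failover choices.
    static void resetSeedOnly(long newSeed) {
        failoverSeed = newSeed;
        System.out.println("Using random seed: " + failoverSeed);
    }

    // The missing "step 6": rebuild the Random so the logged seed is the one in use.
    static void resetSeedAndRandom(long newSeed) {
        failoverSeed = newSeed;
        failoverRandom = new Random(failoverSeed);
        System.out.println("Using random seed: " + failoverSeed);
    }

    // Picks a sequence of NameNode indices in [0, nnCount) to fail over to.
    static int[] nextFailoverTargets(int count, int nnCount) {
        int[] targets = new int[count];
        for (int i = 0; i < count; i++) {
            targets[i] = failoverRandom.nextInt(nnCount);
        }
        return targets;
    }
}
```

With {{resetSeedAndRandom}}, rerunning with a logged seed reproduces the same failover order. With {{resetSeedOnly}}, the sequence simply continues from the original static seed, whatever value is logged.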

> Support more than 2 NameNodes
> -----------------------------
>                 Key: HDFS-6440
>                 URL: https://issues.apache.org/jira/browse/HDFS-6440
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: auto-failover, ha, namenode
>    Affects Versions: 2.4.0
>            Reporter: Jesse Yates
>            Assignee: Jesse Yates
>             Fix For: 3.0.0
>         Attachments: Multiple-Standby-NameNodes_V1.pdf, hdfs-6440-cdh-4.5-full.patch,
hdfs-6440-trunk-v1.patch, hdfs-6440-trunk-v1.patch, hdfs-6440-trunk-v3.patch, hdfs-multiple-snn-trunk-v0.patch
> Most of the work is already done to support more than 2 NameNodes (one active, one standby).
This would be the last bit to support running multiple _standby_ NameNodes; one of the standbys
should be available for fail-over.
> Mostly, this is a matter of updating how we parse configurations, some complexity around
managing the checkpointing, and updating a whole lot of tests.

This message was sent by Atlassian JIRA
