hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5138) Support HDFS upgrade in HA
Date Mon, 27 Jan 2014 21:41:50 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883368#comment-13883368 ]

Suresh Srinivas commented on HDFS-5138:
---------------------------------------

{quote}
Hi Suresh, it's obviously fine that you're busy (we all are) but in the future please just
let me know that you intend to review it and that we should hold off on committing it for
a bit. I reached out to you more than once last week to ask about a review timeline and never
heard back from you, so I asked Todd to commit it (I'm traveling at the moment) given the
silence.
{quote}
[~atm], we last talked about this on Friday Jan 16th over the phone, right? I did tell you
that the JournalNode could potentially lose editlogs.

bq. This scenario isn't possible as you described because either the pre-upgrade or upgrade
stages (depending upon when the original failure happened) will fail to rename the dir if
it already exists.
Is that correct? Did you check it? Java File#renameTo() is platform dependent. The following
code always renames the directories (on my Mac):

{code}
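import java.io.File;

public class RenameTest {
  public static void main(String[] args) {
    File f1 = new File("/tmp/dir1");
    File f2 = new File("/tmp/dir2");
    // Create both directories so that the rename target already exists.
    f1.mkdir();
    f2.mkdir();
    System.out.println(f1 + (f1.exists() ? " exists" : " does not exist"));
    System.out.println(f2 + (f2.exists() ? " exists" : " does not exist"));
    // renameTo() reports failure through its return value, not an exception.
    boolean renamed = f1.renameTo(f2);
    System.out.println((renamed ? "Renamed " : "Failed to rename ") + f1 + " to " + f2);
    System.out.println(f1 + (f1.exists() ? " exists" : " does not exist"));
    System.out.println(f2 + (f2.exists() ? " exists" : " does not exist"));
  }
}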
{code}
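
For comparison, here is a minimal sketch of a rename that is documented to fail when the target already exists, regardless of platform. It relies on java.nio.file.Files.move, which throws FileAlreadyExistsException unless REPLACE_EXISTING is passed. The class and method names (SafeRename, renameIfAbsent) are hypothetical and are not taken from the HDFS-5138 patch. The existence check and the move are not guaranteed to be a single atomic step, so this illustrates the API behavior only and is not a proposed fix.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of HDFS.
final class SafeRename {

  /**
   * Moves src to dst. Returns false and leaves both paths untouched
   * if dst already exists; other I/O failures propagate as IOException.
   */
  static boolean renameIfAbsent(Path src, Path dst) throws IOException {
    try {
      // Without REPLACE_EXISTING, Files.move refuses to overwrite dst.
      Files.move(src, dst);
      return true;
    } catch (FileAlreadyExistsException e) {
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    // Use a fresh temporary directory instead of fixed /tmp paths.
    Path base = Files.createTempDirectory("rename-demo");
    Path dir1 = Files.createDirectory(base.resolve("dir1"));
    Path dir2 = Files.createDirectory(base.resolve("dir2"));

    System.out.println("target exists, renamed: " + renameIfAbsent(dir1, dir2));
    Files.delete(dir2);
    System.out.println("target removed, renamed: " + renameIfAbsent(dir1, dir2));
  }
}
```

Unlike File#renameTo(), which only returns a boolean and leaves the reason for a failure unspecified, this variant separates "target already exists" from other errors, which is the distinction the recovery question below depends on.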

A related question: even if the rename does fail, how does the user recover from that condition?
I brought up several scenarios related to that in preupgrade, upgrade, and finalize. How do
we handle finalize completing successfully on one namenode and not the other?

> Support HDFS upgrade in HA
> --------------------------
>
>                 Key: HDFS-5138
>                 URL: https://issues.apache.org/jira/browse/HDFS-5138
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.1.1-beta
>            Reporter: Kihwal Lee
>            Assignee: Aaron T. Myers
>            Priority: Blocker
>             Fix For: 3.0.0
>
>         Attachments: HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch,
HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch,
HDFS-5138.patch, hdfs-5138-branch-2.txt
>
>
> With HA enabled, NN won't start with "-upgrade". Since there has been a layout version
change between 2.0.x and 2.1.x, starting NN in upgrade mode was necessary when deploying 2.1.x
to an existing 2.0.x cluster. But the only way to get around this was to disable HA and upgrade.

> The NN and the cluster cannot be flipped back to HA until the upgrade is finalized. If
HA is disabled only on NN for layout upgrade and HA is turned back on without involving DNs,
things will work, but finalizeUpgrade won't work (the NN is in HA and it cannot be in upgrade
mode) and DN's upgrade snapshots won't get removed.
> We will need a different way of doing layout upgrade and upgrade snapshot.  I am marking
this as a 2.1.1-beta blocker based on feedback from others.  If there is a reasonable workaround
that does not increase maintenance window greatly, we can lower its priority from blocker
to critical.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
