hadoop-hdfs-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6289) HA failover can fail if there are pending DN messages for DNs which no longer exist
Date Wed, 30 Apr 2014 07:10:17 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985238#comment-13985238 ]

Hadoop QA commented on HDFS-6289:
---------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12642380/HDFS-6289.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 2 new or modified test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}.  The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:

                  org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6771//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6771//console

This message is automatically generated.

> HA failover can fail if there are pending DN messages for DNs which no longer exist
> -----------------------------------------------------------------------------------
>
>                 Key: HDFS-6289
>                 URL: https://issues.apache.org/jira/browse/HDFS-6289
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 2.4.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>            Priority: Critical
>         Attachments: HDFS-6289.patch, HDFS-6289.patch
>
>
> In an HA setup, the standby NN may receive messages from DNs for blocks which the standby
> NN is not yet aware of. It queues up these messages and replays them when it next reads from
> the edit log or fails over. On a failover, all of these pending DN messages must be processed
> successfully in order for the failover to succeed. If one of these pending DN messages refers
> to a DN storageId that no longer exists (because the DN with that transfer address has been
> reformatted and has re-registered with the same transfer address) then on transition to active
> the NN will not be able to process this DN message and will suicide with an error like the
> following:
> {noformat}
> 2014-04-25 14:23:17,922 FATAL namenode.NameNode (NameNode.java:doImmediateShutdown(1525)) - Error encountered requiring NN shutdown. Shutting down immediately.
> java.io.IOException: Cannot mark blk_1073741825_900(stored=blk_1073741825_1001) as corrupt because datanode 127.0.0.1:33324 does not exist
> {noformat}
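
The failure mode in the description above can be illustrated with a small, self-contained Java sketch. This is not the HDFS code and not the attached patch: the class PendingDnMessageSketch, its methods, and the storage IDs "DS-old" and "DS-new" are all hypothetical names chosen for illustration. It only models the behaviour described in the issue, assuming a queue of pending messages keyed by storage ID and a strict replay that aborts on the first message whose storage is no longer registered.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical model of a standby NN's pending DN message queue.
// None of these names are taken from the HDFS source tree.
class PendingDnMessageSketch {

    // One queued report: the storage that sent it and the block it concerns.
    static final class PendingMessage {
        final String storageId;
        final String blockName;

        PendingMessage(String storageId, String blockName) {
            this.storageId = storageId;
            this.blockName = blockName;
        }
    }

    private final Set<String> registeredStorages = new HashSet<>();
    private final Queue<PendingMessage> pending = new ArrayDeque<>();

    void registerStorage(String storageId) {
        registeredStorages.add(storageId);
    }

    // Models a DN being reformatted: its old storage ID disappears.
    void removeStorage(String storageId) {
        registeredStorages.remove(storageId);
    }

    // Standby behaviour: queue the message instead of processing it now.
    void enqueue(String storageId, String blockName) {
        pending.add(new PendingMessage(storageId, blockName));
    }

    // Strict replay (assumed behaviour): every queued message must be
    // processed, so one message from an unknown storage fails the whole
    // transition to active.
    int replayAll() throws IOException {
        int processed = 0;
        PendingMessage m;
        while ((m = pending.poll()) != null) {
            if (!registeredStorages.contains(m.storageId)) {
                throw new IOException("Cannot process " + m.blockName
                        + " because storage " + m.storageId + " does not exist");
            }
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        PendingDnMessageSketch nn = new PendingDnMessageSketch();
        nn.registerStorage("DS-old");
        nn.enqueue("DS-old", "blk_1073741825_900"); // queued while standby
        nn.removeStorage("DS-old");                 // DN reformatted
        nn.registerStorage("DS-new");               // same address, new storage ID
        try {
            nn.replayAll();
            System.out.println("Failover succeeded");
        } catch (IOException e) {
            System.out.println("Failover aborted: " + e.getMessage());
        }
    }
}
```

In this model the stale message is still in the queue when the DN re-registers under a new storage ID, so the replay throws and the simulated failover aborts. How the real NameNode should treat such messages is the subject of the attached patch and is not assumed here.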



--
This message was sent by Atlassian JIRA
(v6.2#6252)
