hadoop-hdfs-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6094) The same block can be counted twice towards safe mode threshold
Date Fri, 14 Mar 2014 02:24:42 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934472#comment-13934472 ]

Jing Zhao commented on HDFS-6094:
---------------------------------

Another possible issue with the current code: when an incremental block report arrives before
the full block report and the stored block state is COMMITTED, we may increase the safe mode
total block count without increasing the safe block count. In that case, I'm not sure whether
the NN could get stuck in safe mode.
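
To make the scenario concrete, here is a minimal sketch of the accounting. It is not the actual NameNode code: the class SafeModeSketch, its method names, and the plain ratio comparison are all assumptions made for illustration, and the real check may round or compare differently. It only shows how a total that grows without a matching safe count can turn a satisfied threshold into an unsatisfied one.

```java
/**
 * Toy model of safe mode block accounting (hypothetical, not HDFS code).
 * Tracks a total block count and a safe block count, and checks whether
 * the safe fraction has reached a configured threshold.
 */
class SafeModeSketch {
    private final double threshold; // e.g. 0.999, as in the test output
    private int blockTotal;         // blocks expected to be reported
    private int blockSafe;          // blocks counted as safely reported

    SafeModeSketch(double threshold, int blockTotal) {
        this.threshold = threshold;
        this.blockTotal = blockTotal;
        this.blockSafe = 0;
    }

    /** A block is reported and counted as safe. */
    void onBlockReported() {
        blockSafe++;
    }

    /**
     * Assumed behavior from the comment above: a COMMITTED block seen in an
     * incremental report raises the total, but the safe count is unchanged.
     */
    void onCommittedBlockCountedInTotalOnly() {
        blockTotal++;
    }

    /** Simplified check: safe count compared against threshold * total. */
    boolean thresholdReached() {
        return blockSafe >= threshold * blockTotal;
    }

    public static void main(String[] args) {
        SafeModeSketch s = new SafeModeSketch(0.999, 6);
        for (int i = 0; i < 6; i++) {
            s.onBlockReported();
        }
        System.out.println("before: " + s.thresholdReached());
        s.onCommittedBlockCountedInTotalOnly();
        System.out.println("after:  " + s.thresholdReached());
    }
}
```

Under this simplified model the threshold stops being satisfied once the total is bumped, and it stays that way unless a later report raises the safe count for that block. Whether the real NN can actually stay stuck depends on details this sketch leaves out.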

> The same block can be counted twice towards safe mode threshold
> ---------------------------------------------------------------
>
>                 Key: HDFS-6094
>                 URL: https://issues.apache.org/jira/browse/HDFS-6094
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.4.0
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>         Attachments: TestHASafeMode-output.txt
>
>
> {{BlockManager#addStoredBlock}} can cause the same block to be counted twice towards the safe
> mode threshold. We see this manifest via {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}}
> failures on Ubuntu. More details to follow in a comment.
> Exception details:
> {code}
>   Time elapsed: 12.874 sec  <<< FAILURE!
> java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.'
>         at org.junit.Assert.fail(Assert.java:93)
>         at org.junit.Assert.assertTrue(Assert.java:43)
>         at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493)
>         at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
