hadoop-hdfs-issues mailing list archives

From "Tsz Wo Nicholas Sze (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
Date Tue, 24 Feb 2015 16:38:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335054#comment-14335054 ]

Tsz Wo Nicholas Sze commented on HDFS-7537:
-------------------------------------------

> In Allen’s comment, the mock-up output shows status as HEALTHY when numUnderMinimalRelicatedBlocks > 0. ...

I see.  Let's keep showing HEALTHY for the moment.  When numUnderMinimalRelicatedBlocks > 0
and there are no missing or corrupt blocks, every under-minimally-replicated block still has
at least one good replica, so it can be re-replicated and no data is lost.  It makes sense
to consider the file system healthy in that case.  Currently, we only have two statuses,
HEALTHY and CORRUPT.  In the future, we may want to add a third status for this case.
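
To make the rule above concrete, here is a minimal sketch of the status decision. It is
not the code from HDFS-7537.1.patch or from NamenodeFsck; the class name FsckStatusSketch,
the method status(), and its three counters are hypothetical names chosen only for
illustration.

```java
// Hypothetical illustration only; this is not the actual fsck implementation.
public class FsckStatusSketch {

    // The two statuses that exist today, per the comment above.
    enum Status { HEALTHY, CORRUPT }

    /**
     * Decide the overall status from three (assumed) block counters.
     *
     * @param missingBlocks            blocks with no live replica at all
     * @param corruptBlocks            blocks whose replicas are all corrupt
     * @param underMinReplicatedBlocks blocks with at least one good replica
     *                                 but fewer than dfs.namenode.replication.min
     */
    static Status status(long missingBlocks, long corruptBlocks,
                         long underMinReplicatedBlocks) {
        if (missingBlocks > 0 || corruptBlocks > 0) {
            // Data may actually be lost, so report CORRUPT.
            return Status.CORRUPT;
        }
        // Under-minimally-replicated blocks still have a good replica and can
        // be re-replicated, so they do not change the status for now.
        return Status.HEALTHY;
    }

    public static void main(String[] args) {
        System.out.println(status(0, 0, 5)); // under-min only
        System.out.println(status(1, 0, 5)); // one block is missing
    }
}
```

The third counter is deliberately unused in the decision; if a separate status is added
later, that branch is where it would go.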

BTW, there is a typo: "numUnderMinimalRelicatedBlocks" should be "numUnderMinimalReplicatedBlocks"

> fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-7537
>                 URL: https://issues.apache.org/jira/browse/HDFS-7537
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Allen Wittenauer
>            Assignee: GAO Rui
>         Attachments: HDFS-7537.1.patch, dfs-min-2-fsck.png, dfs-min-2.png
>
>
> If minimum replication is set to 2 or higher and some of those replicas are missing and
> the namenode restarts, it isn't always obvious that the missing replicas are the reason why
> the namenode isn't leaving safemode.  We should improve the output of fsck and the web UI
> to make it obvious that the missing blocks are from unmet replicas vs. completely/totally
> missing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
