hadoop-hdfs-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7537) fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
Date Tue, 24 Feb 2015 21:53:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335523#comment-14335523 ]

Allen Wittenauer commented on HDFS-7537:
----------------------------------------

bq. When numUnderMinimalRelicatedBlocks > 0 and there is no missing/corrupted block, all
under minimal replicated blocks have at least one good replica so that they can be replicated
and there is no data loss. It makes sense to consider the file system as healthy.

Exactly this.

I made a prototype to play with.  One of the things I did was surround the number of blocks
that didn't meet the replication minimum with asterisks, the same way the corrupted-block
output does.  This made it absolutely crystal clear why the NN wasn't coming out of safemode.
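
To make the idea concrete, here is a rough sketch of the logic in Python. It is not the
prototype and not the attached patch; the function names, the label text, and the banner
width are all made up for illustration. It assumes the health status depends only on the
missing/corrupt counts, while the under-minimum count gets its own asterisk banner.

```python
def is_healthy(missing_blocks: int, corrupt_blocks: int) -> bool:
    """Healthy means no missing and no corrupt blocks.

    Blocks below the replication minimum still have at least one good
    replica, so they do not affect this result.
    """
    return missing_blocks == 0 and corrupt_blocks == 0


def fsck_summary(missing_blocks: int, corrupt_blocks: int,
                 under_min_blocks: int, min_replication: int) -> str:
    """Build a summary that calls out under-minimum blocks with asterisks.

    The label text below is hypothetical, not real fsck output.
    """
    lines = []
    if under_min_blocks > 0:
        banner = "*" * 40  # hypothetical banner width
        lines.append(banner)
        lines.append(f"  Blocks below minimum replication: {under_min_blocks}")
        lines.append(f"  dfs.namenode.replication.min: {min_replication}")
        lines.append(banner)
    status = "HEALTHY" if is_healthy(missing_blocks, corrupt_blocks) else "CORRUPT"
    lines.append(f"Status: {status}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Three blocks are below a minimum of 2, but nothing is missing or corrupt.
    print(fsck_summary(missing_blocks=0, corrupt_blocks=0,
                       under_min_blocks=3, min_replication=2))
```

With these inputs the summary still reports a healthy file system, and the banner is what
tells you why the NN is sitting in safemode.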

> fsck is confusing when dfs.namenode.replication.min > 1 && missing replicas && NN restart
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-7537
>                 URL: https://issues.apache.org/jira/browse/HDFS-7537
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Allen Wittenauer
>            Assignee: GAO Rui
>         Attachments: HDFS-7537.1.patch, dfs-min-2-fsck.png, dfs-min-2.png
>
>
> If minimum replication is set to 2 or higher and some of those replicas are missing and
> the namenode restarts, it isn't always obvious that the missing replicas are the reason why
> the namenode isn't leaving safemode.  We should improve the output of fsck and the web UI
> to make it obvious that the missing blocks are from unmet replicas vs. completely/totally
> missing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
