hadoop-hdfs-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7537) dfs.namenode.replication.min > 1 && missing replicas && NN restart is confusing
Date Thu, 18 Dec 2014 22:10:13 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252383#comment-14252383 ]

Allen Wittenauer commented on HDFS-7537:
----------------------------------------

Mock-up of an fsck that alerts when min rep hasn't actually been met:

{code}
Status: HEALTHY
 Total size:    236 B
 Total dirs:    1
 Total files:   1
 Total symlinks:                0
 Total blocks (validated):      1 (avg. block size 236 B)
  ********************************
  UNDER MIN REPL'D BLOCKS:      1 (100.0 %)
  ********************************
 Minimally replicated blocks:   0 (0.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       1 (100.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     1.0
 Corrupt blocks:                0
 Missing replicas:              2 (66.666664 %)
 Number of data-nodes:          1
 Number of racks:               1
{code}

With all datanodes down (and therefore triggering corrupt/missing blocks):

{code}
Status: CORRUPT
 Total size:	236 B
 Total dirs:	1
 Total files:	1
 Total symlinks:		0
 Total blocks (validated):	1 (avg. block size 236 B)
  ********************************
  UNDER MIN REPL'D BLOCKS: 	1 (100.0 %)
  CORRUPT FILES:	1
  MISSING BLOCKS:	1
  MISSING SIZE:		236 B
  CORRUPT BLOCKS: 	1
  ********************************
 Minimally replicated blocks:	0 (0.0 %)
 Over-replicated blocks:	0 (0.0 %)
 Under-replicated blocks:	0 (0.0 %)
 Mis-replicated blocks:		0 (0.0 %)
 Default replication factor:	3
 Average block replication:	0.0
 Corrupt blocks:		1
 Missing replicas:		0
 Number of data-nodes:		0
 Number of racks:		0
FSCK ended at Thu Dec 18 14:08:25 PST 2014 in 13 milliseconds
{code}
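
The counts in the two mock-ups above are consistent with a simple per-block classification, sketched below. This is an illustration only, not the fsck implementation: the names `BlockState` and `classify` and the category labels are hypothetical, and a minimum replication of 2 is assumed (the mock-ups do not state the value used).

```python
from dataclasses import dataclass


@dataclass
class BlockState:
    live_replicas: int       # replicas currently reported by datanodes
    target_replication: int  # the file's replication factor (3 in the mock-ups)


def classify(block, min_replication):
    """Return the fsck-style categories a block falls into (hypothetical labels)."""
    labels = set()
    if block.live_replicas == 0:
        # No datanode holds the block at all.
        labels.add("missing")
    if block.live_replicas < min_replication:
        # The new category from the mock-up: min replication not met.
        labels.add("under_min_replicated")
    else:
        labels.add("minimally_replicated")
    if 0 < block.live_replicas < block.target_replication:
        # Some replicas exist, but fewer than the file asks for.
        labels.add("under_replicated")
    return labels


# First mock-up: one live replica out of three, assumed min replication of 2.
print(sorted(classify(BlockState(1, 3), min_replication=2)))
# Second mock-up: all datanodes down, so no live replicas.
print(sorted(classify(BlockState(0, 3), min_replication=2)))
```

Under these assumptions the first block is both under-replicated and under min replication, while the second is missing and under min replication but not counted as under-replicated, which matches the figures shown in the two outputs.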

> dfs.namenode.replication.min > 1 && missing replicas && NN restart is confusing
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-7537
>                 URL: https://issues.apache.org/jira/browse/HDFS-7537
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Allen Wittenauer
>         Attachments: dfs-min-2-fsck.png, dfs-min-2.png
>
>
> If minimum replication is set to 2 or higher, some of those replicas are missing, and
> the namenode restarts, it isn't always obvious that the missing replicas are the reason
> the namenode isn't leaving safemode.  We should improve the output of fsck and the web UI
> to make it obvious when blocks are reported missing because of unmet minimum replication
> rather than because they are completely missing.
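
A rough model of why the namenode stays in safemode in this situation is sketched below. It is a simplification under stated assumptions, not the namenode's actual logic: the function name `can_leave_safemode` is hypothetical, the default of 0.999 reflects my understanding of `dfs.namenode.safemode.threshold-pct`, and other conditions (such as minimum live datanodes and the safemode extension period) are ignored.

```python
def can_leave_safemode(live_replica_counts, min_replication, threshold_pct=0.999):
    """Simplified check: do enough blocks meet minimum replication?

    live_replica_counts: one entry per block, the number of replicas
    reported by datanodes after the restart.
    """
    total = len(live_replica_counts)
    if total == 0:
        # Nothing to wait for in an empty namespace.
        return True
    # A block only counts as "safe" once it has min_replication live replicas.
    safe = sum(1 for n in live_replica_counts if n >= min_replication)
    return safe / total >= threshold_pct


# One block with a single surviving replica, as in the first fsck mock-up.
print(can_leave_safemode([1], min_replication=1))  # True
print(can_leave_safemode([1], min_replication=2))  # False
```

With a minimum replication of 1 the single replica is enough, but with a minimum of 2 the same block never counts as safe, so the namenode keeps waiting even though the data is still readable.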



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
