hadoop-hdfs-issues mailing list archives

From "James Estes (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8809) HDFS fsck reports file corruption (Missing blocks) when HBase is running, with Hadoop 2.7.1 and HBase 1.1.1
Date Thu, 23 Jul 2015 16:17:06 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14639069#comment-14639069
] 

James Estes commented on HDFS-8809:
-----------------------------------

We are seeing very similar behavior with HBase 0.98.12 on Hadoop 2.6.0, though in our case
the CORRUPT files are reported only when fsck is run with the -openforwrite flag. It reports
missing blocks for the WALs currently being written to, each with the same 83 B size you see
here. Further, when we shut HBase down, the report is healthy again. We did not see this
behavior with HBase 0.98.12 on Hadoop 2.2.0.

Here is our corrupt blocks report:

{noformat}
bin/hdfs fsck -openforwrite
Connecting to namenode via http://prod-nn2:50070
FSCK started by hadoop (auth:SIMPLE) from /10.73.3.15 for path / at Thu Jul 23 16:02:58 UTC
2015
/data/hbase/WALs/prod-h1,60020,1437667330408/prod-h1%2C60020%2C1437667330408.1437667332462
83 bytes, 1 block(s), OPENFORWRITE:
/data/hbase/WALs/prod-h1,60020,1437667330408/prod-h1%2C60020%2C1437667330408.1437667332462:
MISSING 1 blocks of total size 83 B./data/hbase/WALs/prod-h2,60020,1437667330869/prod-h2%2C60020%2C1437667330869.1437667332991
83 bytes, 1 block(s), OPENFORWRITE:
/data/hbase/WALs/prod-h2,60020,1437667330869/prod-h2%2C60020%2C1437667330869.1437667332991:
MISSING 1 blocks of total size 83 B./data/hbase/WALs/prod-h3,60020,1437667330149/prod-h3%2C60020%2C1437667330149.1437667332298
83 bytes, 1 block(s), OPENFORWRITE:
/data/hbase/WALs/prod-h3,60020,1437667330149/prod-h3%2C60020%2C1437667330149.1437667332298:
MISSING 1 blocks of total size 83 B./data/hbase/WALs/prod-h4,60020,1437667330435/prod-h4%2C60020%2C1437667330435.1437667332567
83 bytes, 1 block(s), OPENFORWRITE:
/data/hbase/WALs/prod-h4,60020,1437667330435/prod-h4%2C60020%2C1437667330435.1437667332567:
MISSING 1 blocks of total size 83 B./data/hbase/WALs/prod-h5,60020,1437667330436/prod-h5%2C60020%2C1437667330436.1437667332928
83 bytes, 1 block(s), OPENFORWRITE:
/data/hbase/WALs/prod-h5,60020,1437667330436/prod-h5%2C60020%2C1437667330436.1437667332928:
MISSING 1 blocks of total size 83 B./data/hbase/WALs/prod-h6,60020,1437667330455/prod-h6%2C60020%2C1437667330455.1437667332532
83 bytes, 1 block(s), OPENFORWRITE:
/data/hbase/WALs/prod-h6,60020,1437667330455/prod-h6%2C60020%2C1437667330455.1437667332532:
MISSING 1 blocks of total size 83 B./data/hbase/WALs/prod-h6,60020,1437667330455/prod-h6%2C60020%2C1437667330455.1437667333025.meta
83 bytes, 1 block(s), OPENFORWRITE:
/data/hbase/WALs/prod-h6,60020,1437667330455/prod-h6%2C60020%2C1437667330455.1437667333025.meta:
MISSING 1 blocks of total size 83 B./data/hbase/WALs/prod-h7,60020,1437667330742/prod-h7%2C60020%2C1437667330742.1437667333075
83 bytes, 1 block(s), OPENFORWRITE:
/data/hbase/WALs/prod-h7,60020,1437667330742/prod-h7%2C60020%2C1437667330742.1437667333075:
MISSING 1 blocks of total size 83 B./data/hbase/WALs/prod-h8,60020,1437667330731/prod-h8%2C60020%2C1437667330731.1437667332978
83 bytes, 1 block(s), OPENFORWRITE:
/data/hbase/WALs/prod-h8,60020,1437667330731/prod-h8%2C60020%2C1437667330731.1437667332978:
MISSING 1 blocks of total size 83 B............................................................................................
....................................................................................................
....................................................................................................
....................................................................................................
....................................................................................................
....................................................................................................
.......................................Status: CORRUPT
 Total size:	104358709960 B
 Total dirs:	517
 Total files:	639
 Total symlinks:		0
 Total blocks (validated):	1294 (avg. block size 80648152 B)
  ********************************
  CORRUPT FILES:	9
  MISSING BLOCKS:	9
  MISSING SIZE:		747 B
  ********************************
 Minimally replicated blocks:	1285 (99.30448 %)
 Over-replicated blocks:	0 (0.0 %)
 Under-replicated blocks:	0 (0.0 %)
 Mis-replicated blocks:		0 (0.0 %)
 Default replication factor:	3
 Average block replication:	2.9791346
 Corrupt blocks:		0
 Missing replicas:		0 (0.0 %)
 Number of data-nodes:		8
 Number of racks:		1
FSCK ended at Thu Jul 23 16:02:58 UTC 2015 in 121 milliseconds

{noformat}
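
To make it easier to tell these reports apart from other missing blocks, here is a minimal sketch that splits the MISSING entries of a saved fsck report into those on files also listed as OPENFORWRITE and everything else. It assumes the text layout shown in the report above. The function name classify_missing and both regular expressions are my own illustration, not part of any Hadoop tool, and the entry format may differ in other Hadoop versions.

```python
import re

# Patterns inferred from the fsck report above (an assumption, not a documented format).
OPEN_RE = re.compile(r"(/\S+) \d+ bytes, \d+ block\(s\), OPENFORWRITE:")
MISSING_RE = re.compile(r"(/\S+): MISSING (\d+) blocks of total size (\d+) B")


def classify_missing(fsck_text):
    """Split MISSING paths into (open_for_write, other) lists."""
    # Collapse line wrapping so entries that span several lines match as one.
    flat = " ".join(fsck_text.split())
    open_paths = set(OPEN_RE.findall(flat))
    open_missing, other_missing = [], []
    for path, _blocks, _size in MISSING_RE.findall(flat):
        if path in open_paths:
            open_missing.append(path)
        else:
            other_missing.append(path)
    return open_missing, other_missing
```

If the second list comes back empty, every MISSING entry belongs to a file that was open for write when fsck ran, which is the pattern we see here. That is only a hint about where to look, not proof that the data is intact.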

> HDFS fsck reports file corruption (Missing blocks) when HBase is running, with Hadoop
2.7.1 and HBase 1.1.1
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8809
>                 URL: https://issues.apache.org/jira/browse/HDFS-8809
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 2.7.0
>         Environment: Hadoop 2.7.1 and HBase 1.1.1, on SUSE11sp3 (other Linuxes not tested,
probably not platform-dependent).  This did NOT happen with Hadoop 2.4 and HBase 0.98.
>            Reporter: Sudhir Prakash
>
> Whenever HBase is running, the "hdfs fsck /"  reports four hbase-related files in the
path "hbase/data/WALs/" as CORRUPT. Even after letting the cluster sit idle for a couple hours,
it is still in the corrupt state.  If HBase is shut down, the problem goes away.  If HBase
is then restarted, the problem recurs.
> {code}
> hades1:/var/opt/teradata/packages # su hdfs
> hdfs@hades1:/var/opt/teradata/packages> hdfs fsck /
> Connecting to namenode via http://hades1.labs.teradata.com:50070/fsck?ugi=hdfs&path=%2F
> FSCK started by hdfs (auth:SIMPLE) from /39.0.8.2 for path / at Wed Jun 24 20:40:17 GMT
2015
> ...
> /apps/hbase/data/WALs/hades4.labs.teradata.com,16020,1435168292684/hades4.labs.teradata.com%2C16020%2C1435168292684.default.1435175500556:
MISSING 1 blocks of total size 83 B.
> /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466..meta.1435175562144.meta:
MISSING 1 blocks of total size 83 B.
> /apps/hbase/data/WALs/hades5.labs.teradata.com,16020,1435168290466/hades5.labs.teradata.com%2C16020%2C1435168290466.default.1435175498500:
MISSING 1 blocks of total size 83 B.
> /apps/hbase/data/WALs/hades6.labs.teradata.com,16020,1435168292373/hades6.labs.teradata.com%2C16020%2C1435168292373.default.1435175500301:
MISSING 1 blocks of total size 83 B..................................................................................................
> ....................................................................................................
> ....................................................................................................
> ........................................................................................Status:
CORRUPT
>  Total size:    723977553 B (Total open files size: 332 B)
>  Total dirs:    79
>  Total files:   388
>  Total symlinks:                0 (Files currently being written: 5)
>  Total blocks (validated):      387 (avg. block size 1870743 B) (Total open file blocks
(not validated): 4)
>   ********************************
>   UNDER MIN REPL'D BLOCKS:      4 (1.0335917 %)
>   dfs.namenode.replication.min: 1
>   CORRUPT FILES:        4
>   MISSING BLOCKS:       4
>   MISSING SIZE:         332 B
>   ********************************
>  Minimally replicated blocks:   387 (100.0 %)
>  Over-replicated blocks:        0 (0.0 %)
>  Under-replicated blocks:       0 (0.0 %)
>  Mis-replicated blocks:         0 (0.0 %)
>  Default replication factor:    3
>  Average block replication:     3.0
>  Corrupt blocks:                0
>  Missing replicas:              0 (0.0 %)
>  Number of data-nodes:          3
>  Number of racks:               1
> FSCK ended at Wed Jun 24 20:40:17 GMT 2015 in 7 milliseconds
> The filesystem under path '/' is CORRUPT
> hdfs@hades1:/var/opt/teradata/packages>
> {code}
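
In the summary quoted above, MISSING BLOCKS (4) and MISSING SIZE (332 B) equal the "Total open file blocks (not validated)" and "Total open files size" figures. A rough sketch of that comparison follows, assuming the summary field labels appear exactly as in the quoted output. The function name missing_matches_open_files is hypothetical, and matching totals are only a heuristic hint that the missing blocks belong to open files.

```python
import re


def _field(pattern, flat):
    """Return the first captured integer for pattern, or None if absent."""
    match = re.search(pattern, flat)
    return int(match.group(1)) if match else None


def missing_matches_open_files(fsck_text):
    """True/False if the totals can be compared, None if a field is absent."""
    # Collapse line wrapping so labels split across lines still match.
    flat = " ".join(fsck_text.split())
    open_size = _field(r"Total open files size: (\d+) B", flat)
    open_blocks = _field(r"Total open file blocks \(not validated\): (\d+)", flat)
    missing_blocks = _field(r"MISSING BLOCKS: (\d+)", flat)
    missing_size = _field(r"MISSING SIZE: (\d+) B", flat)
    if None in (open_size, open_blocks, missing_blocks, missing_size):
        return None
    return missing_blocks == open_blocks and missing_size == open_size
```

A report produced with -openforwrite, like the one in my comment, does not print the open-file totals, so the function returns None for it rather than guessing.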



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
