hadoop-hdfs-issues mailing list archives

From "Nishant Verma (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-11413) HDFS fsck command shows health as corrupt for '/'
Date Tue, 14 Feb 2017 08:02:41 GMT
Nishant Verma created HDFS-11413:
------------------------------------

             Summary: HDFS fsck command shows health as corrupt for '/'
                 Key: HDFS-11413
                 URL: https://issues.apache.org/jira/browse/HDFS-11413
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Nishant Verma


I have an open-source Hadoop 2.7.3 cluster (2 masters + 3 slaves) installed on AWS EC2
instances, and I am using it as the target of a Kafka Connect integration.

The cluster was set up last month and the Kafka Connect setup was completed a fortnight ago.
Since then, we had been able to write Kafka topic records to HDFS and perform various operations on them.

Since yesterday afternoon, no Kafka topic records are getting committed to the cluster.
When I try to open older files (reading them with hdfs dfs -cat), I get the error below.
When I copy a new file to the cluster from local, it opens at first, but after some time
it starts showing a similar IOException:

==========================================================
17/02/14 07:57:55 INFO hdfs.DFSClient: No node available for BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013
file=/test/inputdata/derby.log
17/02/14 07:57:55 INFO hdfs.DFSClient: Could not obtain BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013
from any node: java.io.IOException: No live nodes contain block BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013
after checking nodes = [], ignoredNodes = null No live nodes contain current block Block locations:
Dead nodes: . Will get new block locations from namenode and retry...
17/02/14 07:57:55 WARN hdfs.DFSClient: DFS chooseDataNode: got # 1 IOException, will wait
for 499.3472970548959 msec.
17/02/14 07:57:55 INFO hdfs.DFSClient: No node available for BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013
file=/test/inputdata/derby.log
17/02/14 07:57:55 INFO hdfs.DFSClient: Could not obtain BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013
from any node: java.io.IOException: No live nodes contain block BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013
after checking nodes = [], ignoredNodes = null No live nodes contain current block Block locations:
Dead nodes: . Will get new block locations from namenode and retry...
17/02/14 07:57:55 WARN hdfs.DFSClient: DFS chooseDataNode: got # 2 IOException, will wait
for 4988.873277172643 msec.
17/02/14 07:58:00 INFO hdfs.DFSClient: No node available for BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013
file=/test/inputdata/derby.log
17/02/14 07:58:00 INFO hdfs.DFSClient: Could not obtain BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013
from any node: java.io.IOException: No live nodes contain block BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013
after checking nodes = [], ignoredNodes = null No live nodes contain current block Block locations:
Dead nodes: . Will get new block locations from namenode and retry...
17/02/14 07:58:00 WARN hdfs.DFSClient: DFS chooseDataNode: got # 3 IOException, will wait
for 8598.311122824263 msec.
17/02/14 07:58:09 WARN hdfs.DFSClient: Could not obtain block: BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013
file=/test/inputdata/derby.log No live nodes contain current block Block locations: Dead nodes:
. Throwing a BlockMissingException
17/02/14 07:58:09 WARN hdfs.DFSClient: Could not obtain block: BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013
file=/test/inputdata/derby.log No live nodes contain current block Block locations: Dead nodes:
. Throwing a BlockMissingException
17/02/14 07:58:09 WARN hdfs.DFSClient: DFS Read
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013
file=/test/inputdata/derby.log
        at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:983)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:642)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:882)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)
        at java.io.DataInputStream.read(DataInputStream.java:100)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:119)
        at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:107)
        at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:102)
        at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
        at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
        at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
        at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
        at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
cat: Could not obtain block: BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013
file=/test/inputdata/derby.log
==========================================================
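
For what it is worth, the "after checking nodes = [], ignoredNodes = null" part of the message
suggests the NameNode still has metadata for the block but no DataNode is currently reporting
a replica of it. A generic way to check whether the DataNodes are alive and reporting blocks
(standard commands, nothing cluster-specific assumed) is:

==========================================================
# Show live/dead DataNodes and per-node capacity/block counts
# as seen by the NameNode
hdfs dfsadmin -report

# For one affected path, show its block IDs and which DataNodes
# (if any) hold replicas
hdfs fsck /test/inputdata/derby.log -files -blocks -locations
==========================================================

If -report shows all three DataNodes as live but with near-zero block counts, that would point
to the DataNodes having lost (or stopped serving) their block data rather than to a
NameNode-side problem.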

When I run hdfs fsck /, I get:
==========================================================
 Total size:    667782677 B
 Total dirs:    406
 Total files:   44485
 Total symlinks:                0
 Total blocks (validated):      43767 (avg. block size 15257 B)
  ********************************
  UNDER MIN REPL'D BLOCKS:      43766 (99.99772 %)
  dfs.namenode.replication.min: 1
  CORRUPT FILES:        43766
  MISSING BLOCKS:       43766
  MISSING SIZE:         667781648 B
  CORRUPT BLOCKS:       43766
  ********************************
 Minimally replicated blocks:   1 (0.0022848265 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     6.8544796E-5
 Corrupt blocks:                43766
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          3
 Number of racks:               1
FSCK ended at Tue Feb 14 07:59:10 UTC 2017 in 932 milliseconds


The filesystem under path '/' is CORRUPT
==========================================================
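
Note that the reported average block replication of 6.8544796E-5 is consistent with exactly one
block out of 43767 still having its 3 replicas (3 / 43767 ≈ 6.85E-5); every other block has zero
live replicas. To enumerate exactly which paths are affected, fsck can list them directly
(a standard fsck option, assuming a stock 2.7.3 client):

==========================================================
# List every file that currently has a missing or corrupt block
hdfs fsck / -list-corruptfileblocks
==========================================================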

That means virtually all of my files have become corrupt.

I want to recover my HDFS and clear the corrupt health status. I would also like to understand
how such an issue could occur suddenly, and how to prevent it in future.
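
In case it helps whoever picks this up: my understanding (I have not run these yet) is that if
the replicas cannot be brought back, for example by restarting DataNodes whose
dfs.datanode.data.dir directories still contain the block files, the remaining options are to
move or delete the metadata of the unrecoverable files:

==========================================================
# Move files with missing blocks to /lost+found instead of deleting them
hdfs fsck / -move

# Or, as a last resort, remove the affected files outright
hdfs fsck / -delete
==========================================================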

Many thanks,
Nishant Verma 



