hadoop-hdfs-user mailing list archives

From Philippe Kernévez <pkerne...@octo.com>
Subject Re: HDFS fsck command giving health as corrupt for '/'
Date Wed, 15 Feb 2017 09:31:13 GMT
Hi Nishant,

Your NameNode is probably unable to communicate with your DataNodes. Did you
restart all the HDFS services?
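
To check whether the DataNodes are registered with the NameNode, something
like the following should help. This is a minimal sketch, assuming a standard
Apache Hadoop 2.7.x tarball install with HADOOP_HOME set and the hdfs command
on the PATH:

    # Show live/dead DataNodes as seen by the NameNode
    hdfs dfsadmin -report

    # On each DataNode host, confirm the DataNode JVM is actually running
    jps | grep -i datanode

    # Restart a single DataNode in place, or restart all of HDFS from the
    # NameNode host with $HADOOP_HOME/sbin/stop-dfs.sh and start-dfs.sh
    $HADOOP_HOME/sbin/hadoop-daemon.sh start datanode

If the DataNodes show up as live but report no blocks, their logs under
$HADOOP_HOME/logs are the next place to look.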

Regards,
Philippe

On Tue, Feb 14, 2017 at 10:43 AM, Nishant Verma <nishant.verma0702@gmail.com> wrote:

> Hi
>
> I have an open-source Hadoop 2.7.3 cluster (2 masters + 3 slaves) installed
> on AWS EC2 instances. I am using the cluster to integrate it with Kafka
> Connect.
>
> The cluster was set up last month and the Kafka Connect setup was completed
> a fortnight ago. Since then, we have been able to write Kafka topic records
> to our HDFS and perform various operations on them.
>
> Since yesterday afternoon, I find that Kafka topic records are no longer
> getting committed to the cluster. When I tried to open the older files, I
> started getting the error below. When I copy a new file to the cluster from
> local, it can be opened at first, but after some time it starts showing a
> similar IOException:
>
> 17/02/14 07:57:55 INFO hdfs.DFSClient: No node available for BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013 file=/test/inputdata/derby.log
> 17/02/14 07:57:55 INFO hdfs.DFSClient: Could not obtain BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013 from any node: java.io.IOException: No live nodes contain block BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013 after checking nodes = [], ignoredNodes = null No live nodes contain current block Block locations: Dead nodes: . Will get new block locations from namenode and retry...
> 17/02/14 07:57:55 WARN hdfs.DFSClient: DFS chooseDataNode: got # 1 IOException, will wait for 499.3472970548959 msec.
> 17/02/14 07:57:55 INFO hdfs.DFSClient: No node available for BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013 file=/test/inputdata/derby.log
> 17/02/14 07:57:55 INFO hdfs.DFSClient: Could not obtain BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013 from any node: java.io.IOException: No live nodes contain block BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013 after checking nodes = [], ignoredNodes = null No live nodes contain current block Block locations: Dead nodes: . Will get new block locations from namenode and retry...
> 17/02/14 07:57:55 WARN hdfs.DFSClient: DFS chooseDataNode: got # 2 IOException, will wait for 4988.873277172643 msec.
> 17/02/14 07:58:00 INFO hdfs.DFSClient: No node available for BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013 file=/test/inputdata/derby.log
> 17/02/14 07:58:00 INFO hdfs.DFSClient: Could not obtain BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013 from any node: java.io.IOException: No live nodes contain block BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013 after checking nodes = [], ignoredNodes = null No live nodes contain current block Block locations: Dead nodes: . Will get new block locations from namenode and retry...
> 17/02/14 07:58:00 WARN hdfs.DFSClient: DFS chooseDataNode: got # 3 IOException, will wait for 8598.311122824263 msec.
> 17/02/14 07:58:09 WARN hdfs.DFSClient: Could not obtain block: BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013 file=/test/inputdata/derby.log No live nodes contain current block Block locations: Dead nodes: . Throwing a BlockMissingException
> 17/02/14 07:58:09 WARN hdfs.DFSClient: Could not obtain block: BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013 file=/test/inputdata/derby.log No live nodes contain current block Block locations: Dead nodes: . Throwing a BlockMissingException
> 17/02/14 07:58:09 WARN hdfs.DFSClient: DFS Read
> org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013 file=/test/inputdata/derby.log
>         at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:983)
>         at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:642)
>         at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:882)
>         at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:934)
>         at java.io.DataInputStream.read(DataInputStream.java:100)
>         at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:85)
>         at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
>         at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:119)
>         at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:107)
>         at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:102)
>         at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:317)
>         at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:289)
>         at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:271)
>         at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:255)
>         at org.apache.hadoop.fs.shell.Command.processRawArguments(Command.java:201)
>         at org.apache.hadoop.fs.shell.Command.run(Command.java:165)
>         at org.apache.hadoop.fs.FsShell.run(FsShell.java:287)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>         at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
> cat: Could not obtain block: BP-1831277630-10.16.37.124-1484306078618:blk_1073793876_55013 file=/test/inputdata/derby.log
>
> When I run hdfs fsck /, I get:
>
> Total size:    667782677 B
>  Total dirs:    406
>  Total files:   44485
>  Total symlinks:                0
>  Total blocks (validated):      43767 (avg. block size 15257 B)
>   ********************************
>   UNDER MIN REPL'D BLOCKS:      43766 (99.99772 %)
>   dfs.namenode.replication.min: 1
>   CORRUPT FILES:        43766
>   MISSING BLOCKS:       43766
>   MISSING SIZE:         667781648 B
>   CORRUPT BLOCKS:       43766
>   ********************************
>  Minimally replicated blocks:   1 (0.0022848265 %)
>  Over-replicated blocks:        0 (0.0 %)
>  Under-replicated blocks:       0 (0.0 %)
>  Mis-replicated blocks:         0 (0.0 %)
>  Default replication factor:    3
>  Average block replication:     6.8544796E-5
>  Corrupt blocks:                43766
>  Missing replicas:              0 (0.0 %)
>  Number of data-nodes:          3
>  Number of racks:               1
> FSCK ended at Tue Feb 14 07:59:10 UTC 2017 in 932 milliseconds
>
>
> The filesystem under path '/' is CORRUPT
>
> That means all my files somehow got corrupted.
>
> I want to recover my HDFS and fix the corrupt health status. Also, I would
> like to understand how such an issue occurred so suddenly and how to
> prevent it in the future.
>
>
> Thanks
>
> Nishant Verma
>
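
On the recovery question: fsck itself can tell you which files are affected
and, as a last resort, clear the CORRUPT status by removing them. A short
sketch using the stock fsck options (note that -delete permanently removes
the corrupt files, so run it only once you are sure the blocks cannot be
brought back by the DataNodes):

    # List the files whose blocks are missing or corrupt
    hdfs fsck / -list-corruptfileblocks

    # Show block IDs and (empty) locations for one affected file
    hdfs fsck /test/inputdata/derby.log -files -blocks -locations

    # Last resort: delete the corrupted files to restore a HEALTHY '/'
    hdfs fsck / -delete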



-- 
Philippe Kernévez



Directeur technique (Suisse),
pkernevez@octo.com
+41 79 888 33 32

Retrouvez OCTO sur OCTO Talk : http://blog.octo.com
OCTO Technology http://www.octo.com
