hadoop-common-user mailing list archives

From: Konstantin Shvachko <...@yahoo-inc.com>
Subject: Re: restarting datanode corrupts the hdfs
Date: Thu, 04 Sep 2008 00:23:20 GMT
I can see three possible reasons for that:
1. dfs.data.dir is pointing to the wrong data-node storage directory; or
2. somebody manually moved a directory named "hadoop" into /home/hadoop/dfs/tmp/,
which is supposed to contain only block files named blk_<number>; or
3. there is a collision of configuration variables, so that the same directory
/home/hadoop/dfs/ is used by different servers (e.g. the data-node and the task
tracker) on your single-node cluster; see the sketch below.
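
As a sanity check for (3), it helps to look at which directories the
configuration actually resolves to. A minimal sketch, assuming the standard
0.18 conf/hadoop-site.xml location; the layout in the comments is an
illustrative example, not taken from your setup:

  # Show the storage-related properties; dfs.data.dir, hadoop.tmp.dir and
  # mapred.local.dir should not all resolve to the same /home/hadoop/dfs tree.
  grep -B1 -A2 -E 'dfs\.data\.dir|hadoop\.tmp\.dir|mapred\.local\.dir' \
      conf/hadoop-site.xml
  #
  # A safe layout keeps the data-node storage directory to itself, e.g.
  #   <name>dfs.data.dir</name>   <value>/home/hadoop/dfs/data</value>
  # with the task-tracker scratch space (mapred.local.dir) somewhere else,
  # e.g. /home/hadoop/mapred/local.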

To save the hdfs data you can manually remove "hadoop" from /home/hadoop/dfs/tmp/
and then restart the data-node.
Alternatively, you can manually remove "tmp" from /home/hadoop/dfs/ entirely.
In the latter case you risk losing some of the latest blocks, but not the whole file system.
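
In shell terms the two options look like this (a sketch only; hadoop-daemon.sh
is the stock per-daemon control script, and the backup path in option 1 is an
illustrative choice, not a required location):

  # Stop the data-node before touching its storage directories.
  bin/hadoop-daemon.sh stop datanode

  # Option 1 (safer): move only the stray "hadoop" directory out of tmp/.
  mv /home/hadoop/dfs/tmp/hadoop /home/hadoop/dfs-tmp-hadoop.bak

  # Option 2: remove tmp/ entirely; blocks that existed only in tmp/ may be
  # lost, but the rest of the file system survives.
  # rm -rf /home/hadoop/dfs/tmp

  bin/hadoop-daemon.sh start datanode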

--Konstantin

Barry Haddow wrote:
> Hi 
> 
> Since upgrading to 0.18.0 I've noticed that restarting the datanode corrupts 
> the hdfs so that the only option is to delete it and start again. I'm running 
> hadoop in distributed mode, on a single host. It runs as the user hadoop and 
> the hdfs is contained in a directory /home/hadoop/dfs.
> 
> When I restart hadoop using start-all.sh the datanode fails with the following 
> message:
> 
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.18.0
> STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.18 -r 686010; compiled by 'hadoopqa' on Thu Aug 14 19:48:33 UTC 2008
> ************************************************************/
> 2008-09-01 12:06:55,871 ERROR org.apache.hadoop.dfs.DataNode:
> java.io.IOException: Found /home/hadoop/dfs/tmp/hadoop in /home/hadoop/dfs/tmp but it is not a file.
>         at org.apache.hadoop.dfs.FSDataset$FSVolume.recoverDetachedBlocks(FSDataset.java:437)
>         at org.apache.hadoop.dfs.FSDataset$FSVolume.<init>(FSDataset.java:310)
>         at org.apache.hadoop.dfs.FSDataset.<init>(FSDataset.java:671)
>         at org.apache.hadoop.dfs.DataNode.startDataNode(DataNode.java:277)
>         at org.apache.hadoop.dfs.DataNode.<init>(DataNode.java:190)
>         at org.apache.hadoop.dfs.DataNode.makeInstance(DataNode.java:2987)
>         at org.apache.hadoop.dfs.DataNode.instantiateDataNode(DataNode.java:2942)
>         at org.apache.hadoop.dfs.DataNode.createDataNode(DataNode.java:2950)
>         at org.apache.hadoop.dfs.DataNode.main(DataNode.java:3072)
> 
> 2008-09-01 12:06:55,872 INFO org.apache.hadoop.dfs.DataNode: SHUTDOWN_MSG:
> 
> Running an fsck on the hdfs shows that it is corrupt, and the only way to fix
> it seems to be to delete it and reformat. (A sketch of the fsck invocation
> follows after this message.)
> 
> Any suggestions?
> regards
> Barry
> 
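
For reference, the fsck Barry mentions is the standard HDFS checker; a minimal
0.18 invocation looks like the following (the detail flags are optional):

  # Check the whole namespace; -files/-blocks/-locations add per-file detail.
  bin/hadoop fsck / -files -blocks -locations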
