hadoop-common-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Corrupt HDFS and salvaging data
Date Fri, 09 May 2008 03:35:01 GMT

I have a case of a corrupt HDFS (according to bin/hadoop fsck) and I'm trying not to lose
the precious data in it.  I accidentally ran bin/hadoop namenode -format on a *new DN* that
I had just added to the cluster.  Is it possible for that to have corrupted HDFS?  I also had
to explicitly kill the DN daemons before that, because bin/stop-all.sh didn't stop them for
some reason (it always had before).

Is there any way to salvage the data?  I have a 4-node cluster with a replication factor of
3, though fsck reports lots of under-replicated blocks:

  CORRUPT FILES:        3355
  MISSING BLOCKS:       3462
  MISSING SIZE:         17708821225 B
 Minimally replicated blocks:   28802 (89.269775 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       17025 (52.76779 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     1.7750744
 Missing replicas:              17025 (29.727087 %)
 Number of data-nodes:          4
 Number of racks:               1

The filesystem under path '/' is CORRUPT

What can one do at this point to save the data?  If I run bin/hadoop fsck -move or -delete,
will I lose some of the data?  Or will I simply end up with fewer block replicas and thus
have to force re-replication in order to get back to a "safe" number of replicas?
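For reference, a sketch of the fsck invocations in question (standard options in Hadoop releases of this era; exact paths assume the usual bin/hadoop launcher):

```shell
# Read-only health report for the whole filesystem; modifies nothing.
bin/hadoop fsck /

# -move relocates the surviving blocks of corrupt files into /lost+found.
# Blocks already missing from every DataNode cannot be recovered by it.
bin/hadoop fsck / -move

# -delete removes corrupt files outright; any surviving partial data is lost.
bin/hadoop fsck / -delete
```

Note that both -move and -delete act only on files already marked corrupt; merely under-replicated blocks are re-replicated automatically by the NameNode and need no fsck intervention.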

Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
