hadoop-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Hadoop recovery test
Date Tue, 18 Sep 2012 02:43:52 GMT
Hi Artem,

From what I see, you are running only 1 DN in this cluster, so a target
of 3 replicas can never be met. You can therefore ignore reports such as:
Under replicated blk_7701720691642589882_1086. Target Replicas is 3 but
found 1 replica(s).
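
As a rough check, the figures in the fsck summary quoted below are consistent with a single DataNode holding one replica of every block it can find. The sketch below only redoes that arithmetic; the variable names are illustrative, and the denominator used for the percentage is an assumption chosen because it reproduces the reported 200 %.

```python
# Figures copied from the fsck summary quoted below.
total_blocks = 42        # "Total blocks (validated)"
missing_blocks = 2       # "MISSING BLOCKS"
target_replicas = 3      # "Default replication factor"
datanodes = 1            # "Number of data-nodes"

# A DataNode holds at most one replica of a given block.
replicas_per_found_block = min(target_replicas, datanodes)

found_blocks = total_blocks - missing_blocks
live_replicas = found_blocks * replicas_per_found_block
missing_replicas = found_blocks * (target_replicas - replicas_per_found_block)

# Assumption: the percentage is relative to the replicas that do exist.
average_replication = live_replicas / total_blocks
missing_replicas_pct = 100.0 * missing_replicas / live_replicas

print(found_blocks, missing_replicas)                       # 40 80
print(round(average_replication, 4), missing_replicas_pct)  # 0.9524 200.0
```

These match the reported "Under-replicated blocks: 40", "Missing replicas: 80 (200.0 %)" and "Average block replication: 0.95238096", so the under-replication warnings are fully explained by the single DN.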

The two truly missing blocks are:

/hdfs/hadoop/tmp/mapred/system/jobtracker.info: MISSING 1 blocks
/user/hduser/teragen-out/part-00000: MISSING 1 blocks

This may be because those files were being written at the time you
copied the fsimage and edits. (That is the wrong way to go about it, by
the way. You should configure redundant writes so that you can also
sustain failures, rather than copying the files periodically, which is
not a consistent way to keep a backup. You can use the dfsadmin
fetchImage method instead.) Does that sound likely?
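
To pick the truly missing files out of a long fsck report without reading past every under-replication warning, something like the following could be used. It is a minimal sketch: the `triage` helper and its regular expressions are hypothetical, modelled only on the report lines quoted in this thread, and the sample lines are the quoted ones with the mail client's line wrapping undone.

```python
import re

# Patterns modelled on the fsck lines quoted in this thread (an assumption,
# not an official description of the report format).
MISSING = re.compile(r"^(?P<path>/\S+): MISSING (?P<count>\d+) blocks")
UNDER = re.compile(r"^(?P<path>/\S+):\s+Under replicated (?P<block>blk_-?\d+_\d+)")

def triage(fsck_lines):
    """Split fsck report lines into missing files and under-replicated blocks."""
    missing, under = {}, {}
    for line in fsck_lines:
        line = line.strip()
        m = MISSING.match(line)
        if m:
            missing[m.group("path")] = int(m.group("count"))
            continue
        u = UNDER.match(line)
        if u:
            under.setdefault(u.group("path"), []).append(u.group("block"))
    return missing, under

sample = [
    "/hdfs/hadoop/tmp/mapred/system/jobtracker.info: MISSING 1 blocks of total size 4 B...",
    "/user/hduser/teragen/part-00000:  Under replicated blk_-7008813028808832816_1077. "
    "Target Replicas is 3 but found 1 replica(s).",
    "/user/hduser/teragen-out/part-00000: MISSING 1 blocks of total size 1000000 B...",
]

missing, under = triage(sample)
for path, count in sorted(missing.items()):
    print(path, count)
```

On a one-DN cluster the `under` dictionary can be ignored, and only the entries in `missing` need attention.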

On Tue, Sep 18, 2012 at 3:08 AM, Artem Ervits <are9004@nyp.org> wrote:
> Hello all,
>
>
>
> I am testing the Hadoop recovery as per
> http://wiki.apache.org/hadoop/NameNode document. But instead of using an NFS
> share, I am copying to another directory. Then when I shut down the cluster,
> I scp that directory to another server and start Hadoop cluster using that
> machine as the namenode. I see in the log that some blocks are corrupt
> and/or missing. Do I have to wait for replication to recover all blocks or
> am I doing something else altogether? I am using Hadoop 1.0.3. Can someone
> point me to a more detailed document than the wiki in case I’m doing
> something wrong?
>
>
>
> P.S. If I restart the cluster using the original namenode, the filesystem
> reports as healthy.
>
>
>
> Thank you.
>
>
>
> .
>
> /hdfs/hadoop/tmp/mapred/system/jobtracker.info: CORRUPT block
> blk_9043419219670949307
>
>
>
> /hdfs/hadoop/tmp/mapred/system/jobtracker.info: MISSING 1 blocks of total
> size 4 B...
>
> /user/hduser/teragen/_logs/history/job_201209120941_0002_1347458152167_hduser_TeraGen:
> Under replicated blk_-976282286234272458_1079. Target Replicas is 3 but
> found 1 replica(s).
>
> .
>
> /user/hduser/teragen/_logs/history/job_201209120941_0002_conf.xml:  Under
> replicated blk_137658109390447967_1075. Target Replicas is 3 but found 1
> replica(s).
>
> .
>
> /user/hduser/teragen/_partition.lst:  Under replicated
> blk_-3005280481530403302_1080. Target Replicas is 3 but found 1 replica(s).
>
> .
>
> /user/hduser/teragen/part-00000:  Under replicated
> blk_-7008813028808832816_1077. Target Replicas is 3 but found 1 replica(s).
>
> .
>
> /user/hduser/teragen/part-00001:  Under replicated
> blk_-5256967771026054061_1078. Target Replicas is 3 but found 1 replica(s).
>
> ..
>
> /user/hduser/teragen-out/_logs/history/job_201209120941_0003_1347458249920_hduser_TeraSort:
> Under replicated blk_1137779303840586677_1089. Target Replicas is 3 but
> found 1 replica(s).
>
> .
>
> /user/hduser/teragen-out/_logs/history/job_201209120941_0003_conf.xml:
> Under replicated blk_7701720691642589882_1086. Target Replicas is 3 but
> found 1 replica(s).
>
> .
>
> /user/hduser/teragen-out/part-00000: CORRUPT block blk_8059469267617478950
>
>
>
> /user/hduser/teragen-out/part-00000: MISSING 1 blocks of total size 1000000
> B...
>
> /user/hduser/teragen-validate/_logs/history/job_201209120941_0004_1347458495941_hduser_TeraValidate:
> Under replicated blk_5680565744062298575_1098. Target Replicas is 3 but
> found 1 replica(s).
>
> .
>
> /user/hduser/teragen-validate/_logs/history/job_201209120941_0004_conf.xml:
> Under replicated blk_1566253937037013126_1095. Target Replicas is 3 but
> found 1 replica(s).
>
> .Status: CORRUPT
>
> Total size:    1050720258 B
>
> Total dirs:    39
>
> Total files:   32
>
> Total blocks (validated):      42 (avg. block size 25017149 B)
>
>   ********************************
>
>   CORRUPT FILES:        2
>
>   MISSING BLOCKS:       2
>
>   MISSING SIZE:         1000004 B
>
>   CORRUPT BLOCKS:       2
>
>   ********************************
>
> Minimally replicated blocks:   40 (95.2381 %)
>
> Over-replicated blocks:        0 (0.0 %)
>
> Under-replicated blocks:       40 (95.2381 %)
>
> Mis-replicated blocks:         0 (0.0 %)
>
> Default replication factor:    3
>
> Average block replication:     0.95238096
>
> Corrupt blocks:                2
>
> Missing replicas:              80 (200.0 %)
>
> Number of data-nodes:          1
>
> Number of racks:               1
>
> FSCK ended at Mon Sep 17 17:29:08 EDT 2012 in 21 milliseconds
>
>
>
>
>
> The filesystem under path '/' is CORRUPT
>
>
>
>
>
> Artem Ervits
>
> Data Analyst
>
> New York Presbyterian Hospital
>
>
>
>
> ________________________________
> This electronic message is intended to be for the use only of the named
> recipient, and may contain information that is confidential or privileged.
> If you are not the intended recipient, you are hereby notified that any
> disclosure, copying, distribution or use of the contents of this message is
> strictly prohibited. If you have received this message in error or are not
> the named recipient, please notify us immediately by contacting the sender
> at the electronic mail address noted above, and delete and destroy all
> copies of this message. Thank you.
>
>
>



-- 
Harsh J
