From: Harsh J
Date: Wed, 19 Sep 2012 09:07:13 +0530
Subject: Re: Hadoop recovery test
To: user@hadoop.apache.org

Artem,

If you check the logs of the other DNs, do you see issues with
connectivity to the NameNode?

Basic questions, but I need to ask to be sure: have you checked whether
the firewalls are down or properly configured? Are you sure that the
hostname of the master machine resolves to the IP of its external
interface and not to the loopback address?

On Tue, Sep 18, 2012 at 10:29 PM, Artem Ervits wrote:
> I didn't realize that I hadn't edited core-site and mapred-site on all
> machines to point to the new namenode.
> That didn't make a difference, though; I still see only one datanode,
> which is also the namenode:
>
> Datanodes available: 1 (1 total, 0 dead)
>
> Name: 127.0.0.1:50010
> Decommission Status : Normal
> Configured Capacity: 105425190912 (98.18 GB)
> DFS Used: 1058557952 (1009.52 MB)
> Non DFS Used: 200396800 (191.11 MB)
> DFS Remaining: 104166236160(97.01 GB)
> DFS Used%: 1%
> DFS Remaining%: 98.81%
> Last contact: Tue Sep 18 12:58:07 EDT 2012
>
> The other strange thing is that it points to local 127.0.0.1 rather
> than the namenode's IP.
>
> -----Original Message-----
> From: Artem Ervits [mailto:are9004@nyp.org]
> Sent: Tuesday, September 18, 2012 9:57 AM
> To: user@hadoop.apache.org
> Cc: James Brown
> Subject: RE: Hadoop recovery test
>
> No, it only sees itself. It doesn't see the rest of the nodes.
>
> -----Original Message-----
> From: James Brown [mailto:jb.01@syndicate.net]
> Sent: Monday, September 17, 2012 5:49 PM
> To: user@hadoop.apache.org
> Subject: Re: Hadoop recovery test
>
> Does the new NameNode server see all of the DataNodes?
>
> On 9/17/2012 2:38 PM, Artem Ervits wrote:
>> Hello all,
>>
>> I am testing the Hadoop recovery as per
>> http://wiki.apache.org/hadoop/NameNode document. But instead of using
>> an NFS share, I am copying to another directory. Then when I shut down
>> the cluster, I scp that directory to another server and start the
>> Hadoop cluster using that machine as the namenode. I see in the log
>> that some blocks are corrupt and/or missing. Do I have to wait for
>> replication to recover all blocks or am I doing something else
>> altogether? I am using Hadoop 1.0.3. Can someone point me to a more
>> detailed document than the wiki in case I'm doing something wrong?
>>
>> P.S. If I restart the cluster using the original namenode, the
>> filesystem reports as healthy.
>>
>> Thank you.
>>
>> .
>> /hdfs/hadoop/tmp/mapred/system/jobtracker.info: CORRUPT block blk_9043419219670949307
>> /hdfs/hadoop/tmp/mapred/system/jobtracker.info: MISSING 1 blocks of total size 4 B...
>> /user/hduser/teragen/_logs/history/job_201209120941_0002_1347458152167_hduser_TeraGen: Under replicated blk_-976282286234272458_1079. Target Replicas is 3 but found 1 replica(s).
>> .
>> /user/hduser/teragen/_logs/history/job_201209120941_0002_conf.xml: Under replicated blk_137658109390447967_1075. Target Replicas is 3 but found 1 replica(s).
>> .
>> /user/hduser/teragen/_partition.lst: Under replicated blk_-3005280481530403302_1080. Target Replicas is 3 but found 1 replica(s).
>> .
>> /user/hduser/teragen/part-00000: Under replicated blk_-7008813028808832816_1077. Target Replicas is 3 but found 1 replica(s).
>> .
>> /user/hduser/teragen/part-00001: Under replicated blk_-5256967771026054061_1078. Target Replicas is 3 but found 1 replica(s).
>> ..
>> /user/hduser/teragen-out/_logs/history/job_201209120941_0003_1347458249920_hduser_TeraSort: Under replicated blk_1137779303840586677_1089. Target Replicas is 3 but found 1 replica(s).
>> .
>> /user/hduser/teragen-out/_logs/history/job_201209120941_0003_conf.xml: Under replicated blk_7701720691642589882_1086. Target Replicas is 3 but found 1 replica(s).
>> .
>> /user/hduser/teragen-out/part-00000: CORRUPT block blk_8059469267617478950
>> /user/hduser/teragen-out/part-00000: MISSING 1 blocks of total size 1000000 B...
>> /user/hduser/teragen-validate/_logs/history/job_201209120941_0004_1347458495941_hduser_TeraValidate: Under replicated blk_5680565744062298575_1098. Target Replicas is 3 but found 1 replica(s).
>> .
>> /user/hduser/teragen-validate/_logs/history/job_201209120941_0004_conf.xml: Under replicated blk_1566253937037013126_1095. Target Replicas is 3 but found 1 replica(s).
>> .Status: CORRUPT
>>  Total size: 1050720258 B
>>  Total dirs: 39
>>  Total files: 32
>>  Total blocks (validated): 42 (avg. block size 25017149 B)
>>   ********************************
>>   CORRUPT FILES: 2
>>   MISSING BLOCKS: 2
>>   MISSING SIZE: 1000004 B
>>   CORRUPT BLOCKS: 2
>>   ********************************
>>  Minimally replicated blocks: 40 (95.2381 %)
>>  Over-replicated blocks: 0 (0.0 %)
>>  Under-replicated blocks: 40 (95.2381 %)
>>  Mis-replicated blocks: 0 (0.0 %)
>>  Default replication factor: 3
>>  Average block replication: 0.95238096
>>  Corrupt blocks: 2
>>  Missing replicas: 80 (200.0 %)
>>  Number of data-nodes: 1
>>  Number of racks: 1
>> FSCK ended at Mon Sep 17 17:29:08 EDT 2012 in 21 milliseconds
>>
>> The filesystem under path '/' is CORRUPT
>>
>> Artem Ervits
>> Data Analyst
>> New York Presbyterian Hospital
>>
>> ------------------------------------------------------------------------
>> This electronic message is intended to be for the use only of the
>> named recipient, and may contain information that is confidential or
>> privileged. If you are not the intended recipient, you are hereby
>> notified that any disclosure, copying, distribution or use of the
>> contents of this message is strictly prohibited. If you have received
>> this message in error or are not the named recipient, please notify us
>> immediately by contacting the sender at the electronic mail address
>> noted above, and delete and destroy all copies of this message. Thank you.
>
> ________________________________
>
> Confidential Information subject to NYP's (and its affiliates')
> information management and security policies
> (http://infonet.nyp.org/QA/HospitalManual).

-- 
Harsh J
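
For reference, the two checks asked about above (does the master's hostname resolve to the loopback address, and can a node open a TCP connection to the NameNode) could be scripted roughly as below. This is only a sketch using the Python standard library; the function names `resolves_to_loopback` and `can_connect` are made up for illustration and are not part of Hadoop, and the host and port to test are whatever fs.default.name in core-site.xml points to on your cluster.

```python
import ipaddress
import socket


def resolves_to_loopback(hostname):
    """Return True if any IPv4 address the name resolves to is a loopback address."""
    # gethostbyname_ex returns (canonical_name, alias_list, ipv4_address_list).
    _, _, addresses = socket.gethostbyname_ex(hostname)
    return any(ipaddress.ip_address(addr).is_loopback for addr in addresses)


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

On the master, `resolves_to_loopback(socket.gethostname())` returning True would suggest that /etc/hosts maps the hostname to a 127.x address. On each datanode, `can_connect(namenode_host, namenode_port)` returning False would point at a firewall or addressing problem, where `namenode_host` and `namenode_port` stand for the values from your own configuration.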