From: Harsh J
Date: Sat, 23 Mar 2013 11:12:53 +0530
Subject: Re: Cluster lost IP addresses
To: user@hadoop.apache.org, Chris Embree

Hi Chris,

Where exactly are you seeing issues with the change of NN/DN IPs? I've
never encountered trouble on IP changes (I move across networks every
day, and the HDFS plus MR services I run both stand tall without
requiring a restart). We generally neither store nor rely on IP
addresses. An exception may apply to files under construction, I think,
but a properly shut-down cluster pre-move would not have those, and
such files wouldn't matter much in that scenario anyway. Obviously, a
hostname change could cause issues.

FWIW, you can easily take any person's fsimage from across the world,
start your NN on top of it, add in new DNs with the block data under
them, and set up the HDFS cluster. This is rather painless and
well-built, and it goes to show that HDFS is not really IP-dependent in
any way.

Please do elaborate.

On Sat, Mar 23, 2013 at 10:52 AM, Chris Embree wrote:
> Hey John,
>
> Make sure your /etc/hosts (or DNS) is up to date and any topology
> scripts are updated. Unfortunately, the NN is pretty dumb about IPs
> vs. hostnames.
>
> BTW, NN devs. Seriously? You rely on the IP addr instead of the
> hostname? Someone should probably be shot, or at least be responsible
> for fixing this abomination. Sad that this code was released GA.
>
> Sorry folks.
> HDFS/MapReduce is really cool tech; I'm just jaded about this
> kind of silliness.
>
> In my Not So Humble Opinion,
> Chris
>
>
> On Sat, Mar 23, 2013 at 1:12 AM, Harsh J wrote:
>>
>> The NameNode does not persist block locations, so this is still
>> recoverable if the configs are changed to use the new set of
>> hostnames to bind to/look up.
>>
>> On Sat, Mar 23, 2013 at 9:01 AM, Azuryy Yu wrote:
>> > It has issues: the namenode saves blockid -> nodes using IP
>> > addresses if your slaves config file uses IP addresses instead of
>> > hostnames.
>> >
>> > On Mar 23, 2013 10:14 AM, "Balaji Narayanan (பாலாஜி நாராயணன்)"
>> > wrote:
>> >>
>> >> Assuming you are using hostnames and not IP addresses in your
>> >> config files, what happens when you start the cluster? If you are
>> >> using IP addresses in your configs, just update them and start.
>> >> It should work with no issues.
>> >>
>> >> On Friday, March 22, 2013, John Meza wrote:
>> >>>
>> >>> I have an 18-node cluster that had to be physically moved.
>> >>> Unfortunately all the IP addresses were lost (recreated).
>> >>>
>> >>> This must have happened to someone before.
>> >>> Nothing else on the machines has been changed. Most importantly,
>> >>> the data in HDFS is still sitting there.
>> >>>
>> >>> Is there a way to recover this cluster to a usable state?
>> >>> thanks
>> >>> John
>> >>
>> >> --
>> >> http://balajin.net/blog
>> >> http://flic.kr/balajijegan
>>
>> --
>> Harsh J

--
Harsh J
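[Archive note: Azuryy's warning above is that a slaves file listing raw IP
addresses is exactly what breaks after a re-addressing. A minimal sketch of
that check is below; the file name and host entries are made up for the demo,
not taken from any real cluster.]

```shell
#!/bin/sh
# Sketch: flag entries in a Hadoop "slaves"-style file that are bare IPv4
# addresses rather than hostnames. IP-based entries are the ones that stop
# resolving to the right machines after the cluster is re-addressed.
flag_ip_entries() {
    # Print every line that consists solely of an IPv4 address.
    grep -E '^[0-9]{1,3}(\.[0-9]{1,3}){3}$' "$1" || true
}

# Demo input standing in for a real conf/slaves file (hypothetical hosts).
cat > /tmp/slaves.example <<'EOF'
datanode01.example.com
10.0.0.17
datanode03.example.com
EOF

flag_ip_entries /tmp/slaves.example
```

Any line this prints would need to be replaced with the node's hostname (or
its new IP) before the cluster is brought back up.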
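[Archive note: the recovery recipe in the thread, update any hard-coded IPs in
the configs and restart, can be pre-checked by scanning the conf directory for
IPv4 literals. The conf path, file, and address below are stand-ins for
illustration, not a real cluster's layout.]

```shell
#!/bin/sh
# Sketch: scan a Hadoop conf directory for hard-coded IPv4 addresses, which
# would need to become hostnames (or the new IPs) before restarting.
scan_conf_for_ips() {
    grep -rEn '[0-9]{1,3}(\.[0-9]{1,3}){3}' "$1" || true
}

# Demo conf directory with a hypothetical Hadoop-1.x-style core-site.xml.
mkdir -p /tmp/hadoop-conf.example
cat > /tmp/hadoop-conf.example/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.1.5:8020</value>
  </property>
</configuration>
EOF

scan_conf_for_ips /tmp/hadoop-conf.example
```

Each hit shows the file and line to edit; once only hostnames remain and
/etc/hosts (or DNS) maps them to the new addresses, the NN and DNs can be
restarted and the DNs will re-report their blocks.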