From: Vinayakumar B <vinay.opensource@gmail.com>
To: user@hadoop.apache.org
Date: Tue, 10 Sep 2013 20:43:50 +0530
Subject: Re: hadoop cares about /etc/hosts ?

Ensure that for each IP there is only one hostname configured in the /etc/hosts file.

If you configure multiple different hostnames for the same IP, the OS will choose the first one when finding the hostname for an IP. Similarly, it will choose the first IP listed when resolving a hostname.
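
For example, with the addresses from this thread, a minimal /etc/hosts layout with one canonical hostname per IP would look something like this (a sketch; adjust to your own hosts):

127.0.0.1       localhost
192.168.6.10    tulip    master
192.168.6.5     violet   slave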

Regards,
Vinayakumar B

On Sep 10, 2013 9:27 AM, "Chris Embree" <cembree@gmail.com> wrote:
This sounds entirely like an OS-level problem and is slightly outside the scope of this list. However, I'd suggest you look at your /etc/nsswitch.conf file and ensure that the hosts: line says

hosts: files dns

This will ensure that names are resolved first by /etc/hosts, then by DNS.

Please also ensure that all of your systems have the same configuration and that your NN, JT, SNN, etc. are all using the correct/same hostname.

This is basic name resolution; please do not confuse it with a Hadoop issue, IMHO.
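
If you want to see exactly what your resolver returns, here's a minimal sketch in plain Java (nothing Hadoop-specific; "master" is the hostname from this thread):

import java.net.InetAddress;

public class ResolveCheck {
    public static void main(String[] args) throws Exception {
        // Forward lookup: which IP does the OS pick for "master"?
        InetAddress forward = InetAddress.getByName("master");
        System.out.println("master -> " + forward.getHostAddress());

        // Reverse lookup: which hostname does the OS pick for that IP?
        // When one IP has several /etc/hosts entries, the first one wins,
        // which is how "master" can come back as "localhost".
        InetAddress reverse = InetAddress.getByAddress(forward.getAddress());
        System.out.println(forward.getHostAddress() + " -> " + reverse.getCanonicalHostName());
    }
}

On the problematic /etc/hosts quoted below, the reverse step should print "localhost"; with the extra line commented out, it should print the real hostname.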


On Mon, Sep 9, 2013 at 10:05 PM, Cipher Chen <cipher.chen2012@gmail.com> wrote:
Sorry, I didn't express it well.

conf/masters:
master

conf/slaves:
master
slave

The /etc/hosts file which caused the problem (start-dfs.sh failed):

127.0.0.1       localhost
192.168.6.10    localhost    ###

192.168.6.10    tulip    master
192.168.6.5     violet   slave

But when I commented out the line marked with the hash:

127.0.0.1       localhost
# 192.168.6.10    localhost    ###

192.168.6.10    tulip    master
192.168.6.5     violet   slave

The namenode starts successfully.
I can't figure out why.
How does Hadoop decide which host/hostname/IP to be the namenode?

BTW: how could the namenode care about conf/masters and conf/slaves, since the host that runs start-dfs.sh becomes the namenode? The namenode doesn't need to check those confs. Nodes listed in conf/masters would become the SecondaryNameNode, wouldn't they?


On Mon, Sep 9, 2013 at 10:39 PM, Jitendra Yadav <jeetuyadav200890@gmail.com> wrote:
Means your $HADOOP_HOME/conf/masters file content.


On Mon, Sep 9, 2013 at 7:52 PM, Jay Vyas <jayunit100@gmail.com> wrote:
Jitendra: when you say "check your masters file content", what are you referring to?


On Mon, Sep 9, 2013 at 8:31 AM, Jitendra Yadav <jeetuyadav200890@gmail.com> wrote:

Also, can you please check your masters file content in the hadoop conf directory?

Regards,
Jitendra

On Mon, Sep 9, 2013 at 5:11 PM, Olivier Renault <orenault@hortonworks.com> wrote:

Could you confirm that you put the hash in front of "192.168.6.10    localhost"?

It should look like

# 192.168.6.10    localhost

Thanks
Olivier

On 9 Sep 2013 12:31, "Cipher Chen" <cipher.chen2012@gmail.com> wrote:
Hi everyone,
  I have solved a configuration problem of my own making in Hadoop cluster mode.

I have the configuration below:

  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
  </property>

and the hosts file:


/etc/hosts:
127.0.0.1       localhost
192.168.6.10    localhost    ###

192.168.6.10    tulip    master
192.168.6.5     violet   slave

and when I was trying to run start-dfs.sh, the namenode failed to start.


The namenode log hinted that:

13/09/09 17:09:02 INFO namenode.NameNode: Namenode up at: localhost/192.168.6.10:54310
...
13/09/09 17:09:10 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithF>
13/09/09 17:09:11 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithF>
13/09/09 17:09:12 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithF>
13/09/09 17:09:13 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithF>
13/09/09 17:09:14 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithF>
13/09/09 17:09:15 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithF>
13/09/09 17:09:16 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithF>
13/09/09 17:09:17 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithF>
13/09/09 17:09:18 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithF>
13/09/09 17:09:19 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithF>
...

Now I know that deleting the line "192.168.6.10    localhost    ###" would fix this.
But I still don't know why Hadoop would resolve "master" to "localhost/127.0.0.1".

Seems http://blog.devving.com/why-does-hbase-care-about-etchosts/ explains this, but I'm not quite sure. Is there any other explanation for this?

Thanks.


--
Cipher Chen




--
Jay Vyas
http://jayunit100.blogspot.com



--
Cipher Chen
