hadoop-hdfs-user mailing list archives

From Cipher Chen <cipher.chen2...@gmail.com>
Subject Re: hadoop cares about /etc/hosts ?
Date Thu, 12 Sep 2013 01:41:47 GMT
Hi all,
  Thanks for all your replies and guidance.

  Although I haven't figured out why. :)


On Wed, Sep 11, 2013 at 4:03 PM, Jitendra Yadav
<jeetuyadav200890@gmail.com> wrote:

> Hi,
>
> So what were you expecting while pinging master?
>
> As per my understanding it is working fine. There is no sense in using both
> localhost and a hostname for the same IP; for localhost it's always preferable to
> use the loopback address, i.e. 127.0.0.1.
>
> Hope this will help you.
>
> Regards
> Jitendra
> On Wed, Sep 11, 2013 at 7:05 AM, Cipher Chen <cipher.chen2012@gmail.com> wrote:
>
>>  So for the first *wrong* /etc/hosts file, the sequence would be:
>> find hdfs://master:54310
>> find master -> 192.168.6.10 (*but it already has the IP here*)
>> find 192.168.6.10 -> localhost
>> find localhost -> 127.0.0.1
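>>
>> (A quick way to see what the OS resolver -- which the JVM normally follows on
>> Linux -- actually returns; just a sketch for checking, not Hadoop's own code:
>>
>>   getent hosts master          # forward: name -> address
>>   getent hosts 192.168.6.10    # reverse: address -> first name configured for it
>>
>> With the extra "192.168.6.10    localhost" entry in place, the reverse lookup
>> can come back as "localhost" rather than "tulip"/"master".)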
>>
>>
>> The other thing: when I 'ping master', I get a reply from
>> '192.168.6.10' instead of 127.0.0.1.
>> So it's not simply name resolution at the OS level. Or am I totally
>> wrong?
>>
>>
>>
>> On Tue, Sep 10, 2013 at 11:13 PM, Vinayakumar B <
>> vinay.opensource@gmail.com> wrote:
>>
>>> Ensure that for each IP there is only one hostname configured in the
>>> /etc/hosts file.
>>>
>>> If you configure multiple different hostnames for the same IP, then the OS
>>> will choose the first one when resolving a hostname from an IP, and
>>> similarly when resolving an IP from a hostname.
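>>>
>>> (For example, a cleaned-up version of the hosts file from this thread, with
>>> exactly one set of names per address, would be:
>>>
>>>   127.0.0.1       localhost
>>>   192.168.6.10    tulip master
>>>   192.168.6.5     violet slave
>>>
>>> i.e. the extra "192.168.6.10    localhost" mapping is removed entirely.)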
>>>
>>> Regards,
>>> Vinayakumar B
>>>  On Sep 10, 2013 9:27 AM, "Chris Embree" <cembree@gmail.com> wrote:
>>>
>>>> This sounds entirely like an OS-level problem and is slightly outside of
>>>> the scope of this list. However, I'd suggest you look at your
>>>> /etc/nsswitch.conf file and ensure that the hosts: line says
>>>> hosts: files dns
>>>>
>>>> This will ensure that names are resolved first by /etc/hosts, then by
>>>> DNS.
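>>>>
>>>> (A quick sanity check, assuming a typical glibc/Linux system:
>>>>
>>>>   grep '^hosts:' /etc/nsswitch.conf   # should print: hosts: files dns
>>>>   getent hosts master                 # should return the /etc/hosts entry
>>>> )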
>>>>
>>>> Please also ensure that all of your systems have the same configuration
>>>> and that your NN, JT, SNN, etc. are all using the correct/same hostname.
>>>>
>>>> This is basic name resolution; please do not confuse it with a Hadoop
>>>> issue, IMHO.
>>>>
>>>>
>>>> On Mon, Sep 9, 2013 at 10:05 PM, Cipher Chen <cipher.chen2012@gmail.com
>>>> > wrote:
>>>>
>>>>>  Sorry, I didn't express it well.
>>>>>
>>>>> conf/masters:
>>>>> master
>>>>>
>>>>> conf/slaves:
>>>>> master
>>>>> slave
>>>>>
>>>>> The /etc/hosts file which caused the problem (start-dfs.sh failed):
>>>>> 127.0.0.1       localhost
>>>>> 192.168.6.10    localhost   ###
>>>>>
>>>>> 192.168.6.10    tulip master
>>>>> 192.168.6.5     violet slave
>>>>>
>>>>> But when I commented out the line marked with the hashes:
>>>>> 127.0.0.1       localhost
>>>>> # 192.168.6.10    localhost   ###
>>>>>
>>>>> 192.168.6.10    tulip master
>>>>> 192.168.6.5     violet slave
>>>>>
>>>>> The namenode starts successfully.
>>>>> I can't figure out *why*.
>>>>> How does hadoop decide which host/hostname/IP will be the namenode?
>>>>>
>>>>> BTW: why would the namenode care about conf/masters and conf/slaves,
>>>>> since the host that runs start-dfs.sh becomes the namenode?
>>>>> The namenode doesn't need to check those confs.
>>>>> Nodes listed in conf/masters become the SecondaryNameNode, don't they?
>>>>>
>>>>>
>>>>> On Mon, Sep 9, 2013 at 10:39 PM, Jitendra Yadav <
>>>>> jeetuyadav200890@gmail.com> wrote:
>>>>>
>>>>>> Means your $HADOOP_HOME/conf/masters file content.
>>>>>>
>>>>>>
>>>>>>> On Mon, Sep 9, 2013 at 7:52 PM, Jay Vyas <jayunit100@gmail.com> wrote:
>>>>>>
>>>>>>> Jitendra: When you say "check your masters file content", what are
>>>>>>> you referring to?
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Sep 9, 2013 at 8:31 AM, Jitendra Yadav <
>>>>>>> jeetuyadav200890@gmail.com> wrote:
>>>>>>>
>>>>>>>> Also can you please check your masters file content in the hadoop conf
>>>>>>>> directory?
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> JItendra
>>>>>>>>
>>>>>>>> On Mon, Sep 9, 2013 at 5:11 PM, Olivier Renault <
>>>>>>>> orenault@hortonworks.com> wrote:
>>>>>>>>
>>>>>>>>> Could you confirm that you put the hash in front of
>>>>>>>>> 192.168.6.10    localhost
>>>>>>>>>
>>>>>>>>> It should look like
>>>>>>>>>
>>>>>>>>> # 192.168.6.10    localhost
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Olivier
>>>>>>>>>  On 9 Sep 2013 12:31, "Cipher Chen" <cipher.chen2012@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>   Hi everyone,
>>>>>>>>>>   I have solved a configuration problem of my own making in hadoop
>>>>>>>>>> cluster mode.
>>>>>>>>>>
>>>>>>>>>> I have the following configuration:
>>>>>>>>>>
>>>>>>>>>>   <property>
>>>>>>>>>>     <name>fs.default.name</name>
>>>>>>>>>>     <value>hdfs://master:54310</value>
>>>>>>>>>>   </property>
>>>>>>>>>>
>>>>>>>>>> and the hosts file:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> /etc/hosts:
>>>>>>>>>> 127.0.0.1       localhost
>>>>>>>>>> 192.168.6.10    localhost   ###
>>>>>>>>>>
>>>>>>>>>> 192.168.6.10    tulip master
>>>>>>>>>> 192.168.6.5     violet slave
>>>>>>>>>>
>>>>>>>>>> and when I tried to run start-dfs.sh, the namenode failed to start.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The namenode log hinted that:
>>>>>>>>>> 13/09/09 17:09:02 INFO namenode.NameNode: Namenode up at: localhost/192.168.6.10:54310
>>>>>>>>>> ...
>>>>>>>>>> 13/09/09 17:09:10 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>>>> 13/09/09 17:09:11 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>>>> 13/09/09 17:09:12 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>>>> 13/09/09 17:09:13 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>>>> 13/09/09 17:09:14 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>>>> 13/09/09 17:09:15 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>>>> 13/09/09 17:09:16 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>>>> 13/09/09 17:09:17 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>>>> 13/09/09 17:09:18 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>>>> 13/09/09 17:09:19 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>>>> ...
>>>>>>>>>>
>>>>>>>>>> Now I know deleting the line "192.168.6.10    localhost   ###"
>>>>>>>>>> would fix this.
>>>>>>>>>> But I still don't know
>>>>>>>>>> why hadoop would resolve "master" to "localhost/127.0.0.1".
>>>>>>>>>>
>>>>>>>>>> It seems http://blog.devving.com/why-does-hbase-care-about-etchosts/
>>>>>>>>>> explains this, but I'm not quite sure.
>>>>>>>>>> Is there any other explanation for this?
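>>>>>>>>>>
>>>>>>>>>> (One workaround sometimes suggested for this kind of reverse-lookup
>>>>>>>>>> confusion -- only an untested sketch, not something confirmed in this
>>>>>>>>>> thread -- is to pin fs.default.name to the address itself so the extra
>>>>>>>>>> localhost mapping never enters the picture:
>>>>>>>>>>
>>>>>>>>>>   <property>
>>>>>>>>>>     <name>fs.default.name</name>
>>>>>>>>>>     <value>hdfs://192.168.6.10:54310</value>
>>>>>>>>>>   </property>
>>>>>>>>>>
>>>>>>>>>> The cleaner fix is still the one discussed above: keep a single set of
>>>>>>>>>> names per IP in /etc/hosts.)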
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>  --
>>>>>>>>>> Cipher Chen
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Jay Vyas
>>>>>>> http://jayunit100.blogspot.com
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Cipher Chen
>>>>>
>>>>
>>>>
>>
>>
>> --
>> Cipher Chen
>>
>
>


-- 
Cipher Chen
