hadoop-hdfs-user mailing list archives

From Cipher Chen <cipher.chen2...@gmail.com>
Subject Re: hadoop cares about /etc/hosts ?
Date Wed, 11 Sep 2013 01:35:35 GMT
So for the first *wrong* /etc/hosts file, the resolution sequence would be
(sketched in code below):
find hdfs://master:54310
find master -> 192.168.6.10 (*but it already has the IP at this point*)
find 192.168.6.10 -> localhost
find localhost -> 127.0.0.1
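
A minimal sketch of that forward-then-reverse lookup in plain Java (the class
name is made up, and the behaviour assumes the *wrong* /etc/hosts above is
active with "files" listed first in nsswitch.conf):

  import java.net.InetAddress;

  public class HostsLookupSketch {
      public static void main(String[] args) throws Exception {
          // Forward lookup: "master" -> 192.168.6.10 (first match in /etc/hosts)
          InetAddress master = InetAddress.getByName("master");
          System.out.println("master -> " + master.getHostAddress());

          // Reverse lookup on that IP: the resolver returns the first hostname
          // mapped to 192.168.6.10, which is "localhost" in the wrong file.
          InetAddress byIp = InetAddress.getByName(master.getHostAddress());
          System.out.println(master.getHostAddress() + " -> " + byIp.getHostName());

          // And "localhost" forward-resolves to 127.0.0.1, which would explain
          // a client ending up at localhost/127.0.0.1:54310.
          System.out.println("localhost -> "
                  + InetAddress.getByName("localhost").getHostAddress());
      }
  }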


The other thing: when I 'ping master', I get a reply from '192.168.6.10'
instead of 127.0.0.1.
So it doesn't look like it's simply name resolution at the OS level. Or am I
totally wrong?



On Tue, Sep 10, 2013 at 11:13 PM, Vinayakumar B
<vinay.opensource@gmail.com> wrote:

> Ensure that for each IP there is only one hostname configured in the
> /etc/hosts file.
>
> If you configure multiple different hostnames for the same IP, the OS will
> choose the first one when resolving a hostname from an IP, and likewise when
> resolving an IP from a hostname.
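>
> For example, the hosts file that ends up working later in this thread keeps
> a single line per IP (extra names on the same line are just aliases):
>
> 127.0.0.1       localhost
> 192.168.6.10    tulip master
> 192.168.6.5     violet slave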
>
> Regards,
> Vinayakumar B
> On Sep 10, 2013 9:27 AM, "Chris Embree" <cembree@gmail.com> wrote:
>
>> This sounds entirely like an OS-level problem and is slightly outside the
>> scope of this list. However, I'd suggest you look at your
>> /etc/nsswitch.conf file and ensure that the hosts: line reads
>> hosts: files dns
>>
>> This will ensure that names are resolved first by /etc/hosts, then by DNS.
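>>
>> (A quick way to check what the resolver actually returns, honouring that
>> nsswitch.conf ordering, is 'getent hosts master' for the forward lookup and
>> 'getent hosts 192.168.6.10' for the reverse one, assuming getent is
>> available on your distribution.)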
>>
>> Please also ensure that all of your systems have the same configuration
>> and that your NN, JT, SNN, etc. are all using the correct/same hostname.
>>
>> This is basic name resolution; please do not confuse it with a Hadoop
>> issue, IMHO.
>>
>>
>> On Mon, Sep 9, 2013 at 10:05 PM, Cipher Chen <cipher.chen2012@gmail.com> wrote:
>>
>>> Sorry, I didn't express it well.
>>>
>>> conf/masters:
>>> master
>>>
>>> conf/slaves:
>>> master
>>> slave
>>>
>>> The /etc/hosts file which caused the problem (start-dfs.sh failed):
>>> 127.0.0.1       localhost
>>> 192.168.6.10    localhost    ###
>>>
>>> 192.168.6.10    tulip master
>>> 192.168.6.5     violet slave
>>>
>>> But when I commented out the line marked with the hash,
>>> 127.0.0.1       localhost
>>> # 192.168.6.10    localhost    ###
>>>
>>> 192.168.6.10    tulip master
>>> 192.168.6.5     violet slave
>>>
>>> The namenode starts successfully.
>>> I can't figure out *why*.
>>> How does hadoop decide which host/hostname/ip to be the namenode?
>>>
>>> BTW: why would the namenode care about conf/masters and conf/slaves at
>>> all, since whichever host runs start-dfs.sh becomes the namenode?
>>> The namenode shouldn't need to check those files.
>>> The nodes listed in conf/masters become the SecondaryNameNode, don't they?
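>>>
>>> (As far as the classic 1.x scripts go, conf/masters only tells start-dfs.sh
>>> where to launch the SecondaryNameNode; the namenode's own address comes
>>> from the fs.default.name URI. A plain-Java approximation of that, with a
>>> made-up class name and not the actual Hadoop code path:)
>>>
>>> import java.net.InetSocketAddress;
>>> import java.net.URI;
>>>
>>> public class NamenodeAddrSketch {
>>>     public static void main(String[] args) {
>>>         // The namenode's RPC endpoint comes from fs.default.name,
>>>         // e.g. hdfs://master:54310, not from conf/masters or conf/slaves.
>>>         URI fsUri = URI.create("hdfs://master:54310");
>>>
>>>         // Both the namenode (binding) and clients (connecting) resolve
>>>         // this host through the OS resolver, so /etc/hosts ordering
>>>         // decides which IP "master" maps to.
>>>         InetSocketAddress addr =
>>>                 new InetSocketAddress(fsUri.getHost(), fsUri.getPort());
>>>         System.out.println(addr);  // e.g. master/192.168.6.10:54310
>>>     }
>>> }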
>>>
>>>
>>> On Mon, Sep 9, 2013 at 10:39 PM, Jitendra Yadav <
>>> jeetuyadav200890@gmail.com> wrote:
>>>
>>>> Means your $HADOOP_HOME/conf/masters file content.
>>>>
>>>>
>>>> On Mon, Sep 9, 2013 at 7:52 PM, Jay Vyas <jayunit100@gmail.com> wrote:
>>>>
>>>>> Jitendra:  When you say " check your masters file content"  what are
>>>>> you referring to?
>>>>>
>>>>>
>>>>> On Mon, Sep 9, 2013 at 8:31 AM, Jitendra Yadav <
>>>>> jeetuyadav200890@gmail.com> wrote:
>>>>>
>>>>>> Also, can you please check your masters file content in the hadoop conf
>>>>>> directory?
>>>>>>
>>>>>> Regards
>>>>>> Jitendra
>>>>>>
>>>>>> On Mon, Sep 9, 2013 at 5:11 PM, Olivier Renault <
>>>>>> orenault@hortonworks.com> wrote:
>>>>>>
>>>>>>> Could you confirm that you put the hash in front of 192.168.6.10
>>>>>>> localhost?
>>>>>>>
>>>>>>> It should look like
>>>>>>>
>>>>>>> # 192.168.6.10    localhost
>>>>>>>
>>>>>>> Thanks
>>>>>>> Olivier
>>>>>>>  On 9 Sep 2013 12:31, "Cipher Chen" <cipher.chen2012@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>   Hi everyone,
>>>>>>>>   I have solved a configuration problem of my own making in hadoop
>>>>>>>> cluster mode.
>>>>>>>>
>>>>>>>> I have configuration as below:
>>>>>>>>
>>>>>>>>   <property>
>>>>>>>>     <name>fs.default.name</name>
>>>>>>>>     <value>hdfs://master:54310</value>
>>>>>>>>   </property>
>>>>>>>>
>>>>>>>> and the hosts file:
>>>>>>>>
>>>>>>>>
>>>>>>>> /etc/hosts:
>>>>>>>> 127.0.0.1       localhost
>>>>>>>> 192.168.6.10    localhost    ###
>>>>>>>>
>>>>>>>> 192.168.6.10    tulip master
>>>>>>>> 192.168.6.5     violet slave
>>>>>>>>
>>>>>>>> and when I tried to run start-dfs.sh, the namenode failed to start.
>>>>>>>>
>>>>>>>>
>>>>>>>> namenode log hinted that:
>>>>>>>> 13/09/09 17:09:02 INFO namenode.NameNode: Namenode up at: localhost/192.168.6.10:54310
>>>>>>>> ...
>>>>>>>> 13/09/09 17:09:10 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>> 13/09/09 17:09:11 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>> 13/09/09 17:09:12 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>> 13/09/09 17:09:13 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>> 13/09/09 17:09:14 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>> 13/09/09 17:09:15 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>> 13/09/09 17:09:16 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>> 13/09/09 17:09:17 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>> 13/09/09 17:09:18 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>> 13/09/09 17:09:19 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:54310. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithF>
>>>>>>>> ...
>>>>>>>>
>>>>>>>> Now I know that deleting the line "192.168.6.10    localhost    ###"
>>>>>>>> would fix this.
>>>>>>>> But I still don't know
>>>>>>>> why hadoop would resolve "master" to "localhost/127.0.0.1".
>>>>>>>>
>>>>>>>> It seems http://blog.devving.com/why-does-hbase-care-about-etchosts/
>>>>>>>> explains this, but I'm not quite sure.
>>>>>>>> Is there any other explanation for this?
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>>
>>>>>>>>  --
>>>>>>>> Cipher Chen
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Jay Vyas
>>>>> http://jayunit100.blogspot.com
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Cipher Chen
>>>
>>
>>


-- 
Cipher Chen
