ambari-user mailing list archives

From David Novogrodsky <david.novogrod...@gmail.com>
Subject Re: Problem with Ambari 1.7 recognizing hosts running CentOS 6
Date Tue, 16 Dec 2014 18:15:50 GMT
There is nothing simply done in Ambari.  :)

After I changed the name of this computer and restarted the namenode, Ambari
does not recognize any node.  The main error I am wondering about is this:
INFO 2014-12-16 12:02:29,669 main.py:233 - Connecting to Ambari server at
https://namenode.localdomain:8440 (98.124.198.1)
INFO 2014-12-16 12:02:29,670 NetUtil.py:48 - Connecting to
https://namenode.localdomain:8440/ca
WARNING 2014-12-16 12:02:29,718 NetUtil.py:71 - Failed to connect to
https://namenode.localdomain:8440/ca due to [Errno 111] Connection refused
WARNING 2014-12-16 12:02:29,719 NetUtil.py:92 - Server at
https://namenode.localdomain:8440 is not reachable, sleeping for 10
seconds...
', None)
Why is Ambari using namenode.localdomain to connect?
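
One place to look is the agent's own configuration: the server name is read
from the hostname key in the [server] section of
/etc/ambari-agent/conf/ambari-agent.ini rather than from /etc/hosts.  A minimal
sketch for reading it, assuming the stock ini layout (the sample text and the
server_endpoint helper are illustrative, not copied from this machine):

```python
import configparser

# Sample text standing in for /etc/ambari-agent/conf/ambari-agent.ini.
# The [server] layout is an assumption based on the stock agent config.
SAMPLE_INI = """\
[server]
hostname=namenode.localdomain
url_port=8440
secured_url_port=8441
"""


def server_endpoint(ini_text):
    """Return (hostname, url_port) from the [server] section of the ini text."""
    # Interpolation is disabled so '%' characters in other keys do not raise.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser.get("server", "hostname"), parser.getint("server", "url_port")


if __name__ == "__main__":
    print(server_endpoint(SAMPLE_INI))
```

To run it against the real file, read the file contents and pass them to
server_endpoint in place of SAMPLE_INI.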

Ambari is running on this node, which is the namenode of this cluster.  The
hosts file for this computer is:
  GNU nano 2.0.9              File: /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.200.144 localhost.datanode10
192.168.200.107 localhost.datanode01
192.168.200.143 namenode.localdomain.com namenode
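
Note that this file lists namenode.localdomain.com, while the agent log
connects to namenode.localdomain.  A name that is missing from /etc/hosts
typically falls through to the next resolver (usually DNS), which would be
consistent with the public address 98.124.198.1 in the log.  A small sketch
that checks which names the file actually defines; the parse_hosts helper is
hypothetical and only approximates how the system resolver reads the file:

```python
def parse_hosts(text):
    """Map every name in hosts-file text to the address on its line."""
    names = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip comments and whitespace
        if not line:
            continue
        address, *aliases = line.split()
        for alias in aliases:
            names.setdefault(alias, address)  # keep the first address seen
    return names


# Contents of the hosts file shown above.
HOSTS = """\
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.200.144 localhost.datanode10
192.168.200.107 localhost.datanode01
192.168.200.143 namenode.localdomain.com namenode
"""

if __name__ == "__main__":
    table = parse_hosts(HOSTS)
    for name in ("namenode.localdomain", "namenode.localdomain.com"):
        print(name, "->", table.get(name, "not in /etc/hosts"))
```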

The Ambari wizard said I needed to use fully qualified domain names, so

What follows is the detailed registration log.  I get this error
in the registration log for namenode.localdomain.com:
--
==========================
Creating target directory...
==========================

Command start time 2014-12-16 12:02:18

Connection to namenode.localdomain.com closed.
SSH command execution finished
host=namenode.localdomain.com, exitcode=0
Command end time 2014-12-16 12:02:18

==========================
Copying common functions script...
==========================

Command start time 2014-12-16 12:02:18

scp /usr/lib/python2.6/site-packages/ambari_commons
host=namenode.localdomain.com, exitcode=0
Command end time 2014-12-16 12:02:18

==========================
Copying OS type check script...
==========================

Command start time 2014-12-16 12:02:18

scp /usr/lib/python2.6/site-packages/ambari_server/os_check_type.py
host=namenode.localdomain.com, exitcode=0
Command end time 2014-12-16 12:02:18

==========================
Running OS type check...
==========================

Command start time 2014-12-16 12:02:18
Cluster primary/cluster OS type is redhat6 and local/current OS type is
redhat6

Connection to namenode.localdomain.com closed.
SSH command execution finished
host=namenode.localdomain.com, exitcode=0
Command end time 2014-12-16 12:02:19

==========================
Checking 'sudo' package on remote host...
==========================

Command start time 2014-12-16 12:02:19
sudo-1.8.6p3-15.el6.x86_64

Connection to namenode.localdomain.com closed.
SSH command execution finished
host=namenode.localdomain.com, exitcode=0
Command end time 2014-12-16 12:02:20

==========================
Copying repo file to 'tmp' folder...
==========================

Command start time 2014-12-16 12:02:20

scp /etc/yum.repos.d/ambari.repo
host=namenode.localdomain.com, exitcode=0
Command end time 2014-12-16 12:02:20

==========================
Moving file to repo dir...
==========================

Command start time 2014-12-16 12:02:20

Connection to namenode.localdomain.com closed.
SSH command execution finished
host=namenode.localdomain.com, exitcode=0
Command end time 2014-12-16 12:02:21

==========================
Copying setup script file...
==========================

Command start time 2014-12-16 12:02:21

scp /usr/lib/python2.6/site-packages/ambari_server/setupAgent.py
host=namenode.localdomain.com, exitcode=0
Command end time 2014-12-16 12:02:21

==========================
Running setup agent script...
==========================

Command start time 2014-12-16 12:02:21
Verifying Python version compatibility...
Using python  /usr/bin/python2.6
Found ambari-agent PID: 5036
Stopping ambari-agent
Removing PID file at /var/run/ambari-agent/ambari-agent.pid
ambari-agent successfully stopped
Restarting ambari-agent
Verifying Python version compatibility...
Using python  /usr/bin/python2.6
ambari-agent is not running. No PID found at
/var/run/ambari-agent/ambari-agent.pid
Verifying Python version compatibility...
Using python  /usr/bin/python2.6
Checking for previously running Ambari Agent...
Starting ambari-agent
Verifying ambari-agent process status...
Ambari Agent successfully started
Agent PID at: /var/run/ambari-agent/ambari-agent.pid
Agent out at: /var/log/ambari-agent/ambari-agent.out
Agent log at: /var/log/ambari-agent/ambari-agent.log
('WARNING 2014-12-16 12:01:59,642 NetUtil.py:92 - Server at
https://namenode.localdomain:8440 is not reachable, sleeping for 10
seconds...
INFO 2014-12-16 12:02:09,653 NetUtil.py:48 - Connecting to
https://namenode.localdomain:8440/ca
WARNING 2014-12-16 12:02:09,701 NetUtil.py:71 - Failed to connect to
https://namenode.localdomain:8440/ca due to [Errno 111] Connection refused
WARNING 2014-12-16 12:02:09,701 NetUtil.py:92 - Server at
https://namenode.localdomain:8440 is not reachable, sleeping for 10
seconds...
INFO 2014-12-16 12:02:19,711 NetUtil.py:48 - Connecting to
https://namenode.localdomain:8440/ca
WARNING 2014-12-16 12:02:19,770 NetUtil.py:71 - Failed to connect to
https://namenode.localdomain:8440/ca due to [Errno 111] Connection refused
WARNING 2014-12-16 12:02:19,770 NetUtil.py:92 - Server at
https://namenode.localdomain:8440 is not reachable, sleeping for 10
seconds...
INFO 2014-12-16 12:02:22,680 main.py:83 - loglevel=logging.INFO
INFO 2014-12-16 12:02:22,681 main.py:55 - signal received, exiting.
INFO 2014-12-16 12:02:22,681 ProcessHelper.py:39 - Removing pid file
INFO 2014-12-16 12:02:22,681 ProcessHelper.py:46 - Removing temp files
INFO 2014-12-16 12:02:29,532 main.py:83 - loglevel=logging.INFO
INFO 2014-12-16 12:02:29,533 DataCleaner.py:36 - Data cleanup thread started
INFO 2014-12-16 12:02:29,534 DataCleaner.py:117 - Data cleanup started
INFO 2014-12-16 12:02:29,542 DataCleaner.py:119 - Data cleanup finished
INFO 2014-12-16 12:02:29,667 PingPortListener.py:51 - Ping port listener
started on port: 8670
INFO 2014-12-16 12:02:29,669 main.py:233 - Connecting to Ambari server at
https://namenode.localdomain:8440 (98.124.198.1)
INFO 2014-12-16 12:02:29,670 NetUtil.py:48 - Connecting to
https://namenode.localdomain:8440/ca
WARNING 2014-12-16 12:02:29,718 NetUtil.py:71 - Failed to connect to
https://namenode.localdomain:8440/ca due to [Errno 111] Connection refused
WARNING 2014-12-16 12:02:29,719 NetUtil.py:92 - Server at
https://namenode.localdomain:8440 is not reachable, sleeping for 10
seconds...
', None)

Connection to namenode.localdomain.com closed.
SSH command execution finished
host=namenode.localdomain.com, exitcode=0
Command end time 2014-12-16 12:02:32

Registering with the server...
Registration with the server failed.
----
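
"[Errno 111] Connection refused" in the log above means the TCP connection to
port 8440 at the resolved address was actively rejected, so most likely nothing
was listening there.  A quick way to test this from the agent host, as a sketch
only (can_connect is a hypothetical helper; the host and port are the ones from
the log):

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts and lookup failures
        return False


if __name__ == "__main__":
    # Host and port copied from the agent log above.
    print(can_connect("namenode.localdomain", 8440))
```

Running the same check against the server's private address (192.168.200.143
in the hosts file above) shows whether the problem is name resolution or the
server port itself.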

David Novogrodsky
david.novogrodsky@gmail.com
http://www.linkedin.com/in/davidnovogrodsky

On Mon, Dec 15, 2014 at 10:02 PM, Devopam Mittra <devopam@gmail.com> wrote:
>
> May I suggest you simply do a ssh -l <keylessusername> using the previous
> and the new FQDNs that you have defined to verify which one is in effect,
> and accessible ?
> Also, since you changed the FQDN, you may wish to simply reboot the
> cluster once, just to make sure that new ones are in-place.
> It might happen that after the reboot you will need to redo the ssh
> keyless pairing once again (most probably)
>
> regards
> Devopam
>
>
> On Tue, Dec 16, 2014 at 4:32 AM, David Novogrodsky <
> david.novogrodsky@gmail.com> wrote:
>>
>> The changes I am making in the hosts file are not being picked up by the
>> installation scripts of Ambari.  I was told I could make changes to the
>> hosts file and that Ambari would see them.  I have
>> checked the /etc/ambari-agent/conf/ambari-agent.ini file and the changes I
>> made to the hosts file are not showing up in that file.  Where is Ambari
>> getting the names for the other nodes in the cluster?
>>
>> Here are the changes I made to the hosts file on the host for the name
>> node:
>> 127.0.0.1   localhost localhost.localdomain localhost4
>> localhost4.localdomain4
>> ::1         localhost localhost.localdomain localhost6
>> localhost6.localdomain6
>> 192.168.200.144 datanode10.localdomain
>> 192.168.200.107 datanode01.localdomain
>> 192.168.200.143 namenode.localdomain namenode
>>
>> Since I made these changes Ambari can not discover any of the nodes in
>> the network.  None of them.
>>
>> I have not made these changes to the other nodes because I do not want to
>> make changes to the other nodes until I can see Ambari discover the host it
>> is sitting upon.
>>
>> Regarding the commands you mentioned, here are the results:
>> [root@localhost conf]# hostname -f
>> hostname: Unknown host
>> [root@localhost conf]# hostname
>> localhost.namenode
>> [root@localhost conf]#  python -c 'import socket; print socket.getfqdn()'
>> localhost.namenode
>>
>> localhost.namenode was the name I used for this host during the
>> installation of CentOS.   I thought you said I could make changes to the
>> hosts file and the installation scripts would recognize them?
>>
>> From the Confirm Hosts page I am getting the following errors:
>> for connecting to the name node
>>
>> STDOUT: {'exitstatus': 1, 'log': "Host registration aborted. Ambari Agent host
>> cannot reach Ambari Server 'localhost.namenode:8080'. Please check the network
>> connectivity between the Ambari Agent host and the Ambari Server"}
>>
>> for connecting to the datanode10
>>
>> INFO 2014-12-15 16:42:33,348 DataCleaner.py:36 - Data cleanup thread started
>> ERROR 2014-12-15 16:42:33,349 main.py:137 - Ambari agent machine hostname
>>  (localhost.datanode10) does not match expected ambari server hostname
>> (datanode10.localdomain). Aborting registration. Please check hostname,
>> hostname -f and /etc/hosts file to confirm your hostname is setup correctly
>> ', None)
>>
>> I am getting a similar error when trying to reach datanode01.  Please
>> note I used the following names for the datanodes when I installed
>> CentOS:
>> datanode10 --> localhost.datanode10
>> datanode01 --> localhost.datanode01
>>
>>
>>
>>
>>
>> David Novogrodsky
>> david.novogrodsky@gmail.com
>> http://www.linkedin.com/in/davidnovogrodsky
>>
>> On Mon, Dec 15, 2014 at 11:50 AM, Yusaku Sako <yusaku@hortonworks.com>
>> wrote:
>>>
>>> Did you change the FQDNs like I proposed, like namenode.localdomain,
>>> rather than localhost.namenode?
>>> Did you ensure that the 3 commands returned the results as shown?
>>> Can each host resolve all the other hosts by name?
>>>
>>> If you want to get a cluster up and running on VMs, the best bet is to
>>> use:
>>> https://cwiki.apache.org/confluence/display/AMBARI/Quick+Start+Guide
>>>
>>> This sets up all /etc/hosts and other settings in the way you want.
>>> Then you can see how these VMs are being set up and mimic on your VMs if
>>> you'd rather set them up from scratch.
>>>
>>> I hope this helps.
>>> Yusaku
>>>
>>>
>>> On Mon, Dec 15, 2014 at 8:18 AM, David Novogrodsky <
>>> david.novogrodsky@gmail.com> wrote:
>>>>
>>>> Ok, I removed the multiple instances of localhost.namenode.  It now
>>>> only appears on one line in the hosts file.
>>>>
>>>> The main ambari server still cannot see the data nodes nor the node
>>>> Ambari is on.  Ambari is on the namenode.  When I run the install, the
>>>> install program can not connect to any node in the network.
>>>>
>>>> Also I tried running /etc/init.d/network restart on one of the nodes;
>>>> datanode10 ( a virtual machine).  Now that node cannot connect to the
>>>> internet....I would like to send you the information but I am having
>>>> problems setting the document from the virtual machine.
>>>>
>>>> I do not have a DNS.  These machines have hardwired IP addresses and
>>>> names in the hosts file. Did running /etc/init.d/network restart break the
>>>> connection?
>>>>
>>>>
>>>> David Novogrodsky
>>>> david.novogrodsky@gmail.com
>>>> http://www.linkedin.com/in/davidnovogrodsky
>>>>
>>>> On Sat, Dec 13, 2014 at 12:46 AM, Yusaku Sako <yusaku@hortonworks.com>
>>>> wrote:
>>>>>
>>>>> You can just make the changes in /etc/hosts.  You might also
>>>>> change /etc/sysconfig/network and run /etc/init.d/network restart.
>>>>>
>>>>> Then make sure that running the 3 commands return expected results.
>>>>>
>>>>> Yusaku
>>>>>
>>>>> On Fri, Dec 12, 2014 at 9:06 PM, David Novogrodsky <
>>>>> david.novogrodsky@gmail.com> wrote:
>>>>>>
>>>>>> When I installed CentOS on the machines, I chose those names,
>>>>>> localhost.datanode01...and so on.  You mean I have to reinstall CentOS on
>>>>>> the machines again?
>>>>>>
>>>>>> Can I just make the changes in the host files?
>>>>>>
>>>>>> Will I need to recreate the SSH keys?
>>>>>>
>>>>>> David Novogrodsky
>>>>>> david.novogrodsky@gmail.com
>>>>>> http://www.linkedin.com/in/davidnovogrodsky
>>>>>>
>>>>>> On Fri, Dec 12, 2014 at 6:21 PM, Yusaku Sako <yusaku@hortonworks.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I would set it up like this:
>>>>>>>
>>>>>>> 127.0.0.1 localhost localhost.localdomain localhost4
>>>>>>> localhost4.localdomain4   <- do not list the hostname here.
>>>>>>> ::1 localhost localhost.localdomain localhost6
>>>>>>> localhost6.localdomain6
>>>>>>> xxx.xxx.200.144 datanode10.localdomain
>>>>>>> xxx.xxx.200.107 datanode01.localdomain
>>>>>>> xxx.xxx.200.143 namenode.localdomain namenode
>>>>>>>
>>>>>>> With this change:
>>>>>>> * *hostname -f* should display *namenode.localdomain*
>>>>>>> * *hostname* should display *namenode*
>>>>>>> * *python -c 'import socket; print socket.getfqdn()' *should
>>>>>>> display *namenode.localdomain*
>>>>>>>
>>>>>>> I hope this helps.
>>>>>>> Yusaku
>>>>>>>
>>>>>>> On Fri, Dec 12, 2014 at 3:52 PM, David Novogrodsky <
>>>>>>> david.novogrodsky@gmail.com> wrote:
>>>>>>>>
>>>>>>>> All,
>>>>>>>>
>>>>>>>> I am having a problem with Ambari.
>>>>>>>> I am trying to use Ambari to install Hadoop on a three-node
>>>>>>>> cluster. The name node is where the Ambari server is located. I am
>>>>>>>> getting this error:
>>>>>>>> ERROR 2014-12-12 17:39:56,963 main.py:137 - Ambari agent machine
>>>>>>>> hostname (localhost.localdomain) does not match expected ambari server
>>>>>>>> hostname (namenode). Aborting registration. Please check hostname,
>>>>>>>> hostname -f and /etc/hosts file to confirm your hostname is setup
>>>>>>>> correctly
>>>>>>>> ', None)
>>>>>>>>
>>>>>>>> Here is the contents of my hosts file:
>>>>>>>> 127.0.0.1 localhost localhost.localdomain localhost4
>>>>>>>> localhost4.localdomain4 localhost.namenode namenode
>>>>>>>> ::1 localhost localhost.localdomain localhost6
>>>>>>>> localhost6.localdomain6
>>>>>>>> xxx.xxx.200.144 localhost.datanode10
>>>>>>>> xxx.xxx.200.107 localhost.datanode01
>>>>>>>> xxx.xxx.200.143 localhost.namenode namenode
>>>>>>>>
>>>>>>>> I am not sure what the problem is. Since there are only four steps
>>>>>>>> to run ambari there is not a lot of background to determine the
>>>>>>>> cause of this problem.
>>>>>>>>
>>>>>>>> David Novogrodsky
>>>>>>>> david.novogrodsky@gmail.com
>>>>>>>> http://www.linkedin.com/in/davidnovogrodsky
>>>>>>>>
>>>>>>>
>>>>>>> CONFIDENTIALITY NOTICE
>>>>>>> NOTICE: This message is intended for the use of the individual or
>>>>>>> entity to which it is addressed and may contain information that is
>>>>>>> confidential, privileged and exempt from disclosure under applicable law.
>>>>>>> If the reader of this message is not the intended recipient, you are hereby
>>>>>>> notified that any printing, copying, dissemination, distribution,
>>>>>>> disclosure or forwarding of this communication is strictly prohibited. If
>>>>>>> you have received this communication in error, please contact the sender
>>>>>>> immediately and delete it from your system. Thank You.
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
> --
> Devopam Mittra
> Life and Relations are not binary
>
