incubator-ambari-user mailing list archives

From Christian Smith <christ...@greenbutton.com>
Subject Re: Ambari server claiming no heartbeats from agents
Date Sat, 07 Sep 2013 23:21:35 GMT
Hi Sumit,

It seems that still hasn't fixed the issue.  The clocks are synced and
services restarted.  From the logs I see:

Agent:
INFO 2013-09-07 23:11:08,906 Heartbeat.py:61 - Sending heartbeat with
response id: 4 and timestamp: 1378595468906

Server:
23:11:09,795  INFO HeartBeatHandler:108 - Received heartbeat from host,
hostname=hadoop-cluster-1-1-10613434.greenbutton.local,
currentResponseId=4, receivedResponseId=4

The UI still reports that it hasn't received a heartbeat from that agent in
over 3 minutes.
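
As a quick cross-check of the two log lines above, here is a minimal Python sketch that converts the agent's heartbeat timestamp (epoch milliseconds) to UTC and measures the gap to the server's receive time. The helper names are illustrative, and because the server log line carries no date or timezone, the sketch assumes the same day in UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def heartbeat_time(timestamp_ms):
    """Convert an epoch-millisecond heartbeat timestamp to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=timestamp_ms)


def skew_seconds(agent_timestamp_ms, server_received):
    """Seconds between the agent's send time and the server's receive time."""
    return (server_received - heartbeat_time(agent_timestamp_ms)).total_seconds()


# Timestamp copied from the agent log line.
agent_ts = 1378595468906
# Time of day copied from the server log line; date and UTC are assumptions.
server_seen = datetime(2013, 9, 7, 23, 11, 9, 795000, tzinfo=timezone.utc)

print(heartbeat_time(agent_ts).isoformat())
print(skew_seconds(agent_ts, server_seen))
```

Under those assumptions the gap is under one second, which is consistent with the clocks now being in sync.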

I've attached screenshots that show all hostnames are aligned.
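
The same comparison can be scripted rather than read off screenshots. The Python sketch below takes the parsed JSON from the two API calls quoted further down (registered hosts and cluster hosts) and reports names that appear in only one list. It assumes each response has the shape {"items": [{"Hosts": {"host_name": ...}}]}; the function names and the sample hostnames are hypothetical.

```python
def host_names(payload):
    """Collect host names from a parsed host-list response.

    Assumes the shape {"items": [{"Hosts": {"host_name": ...}}, ...]}.
    """
    return {item["Hosts"]["host_name"] for item in payload.get("items", [])}


def compare_hosts(registered_payload, cluster_payload):
    """Return the host names that appear in only one of the two responses."""
    registered = host_names(registered_payload)
    in_cluster = host_names(cluster_payload)
    return {
        "registered_only": sorted(registered - in_cluster),
        "cluster_only": sorted(in_cluster - registered),
    }


# Hypothetical sample data: the agent registered with a short name,
# while the cluster was configured with the FQDN.
registered = {"items": [{"Hosts": {"host_name": "node-1"}}]}
cluster = {"items": [{"Hosts": {"host_name": "node-1.example.local"}}]}

print(compare_hosts(registered, cluster))
```

Two empty lists mean the names match exactly. The comparison is case-sensitive, so names that differ only in case are reported as different.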

Thanks,
Christian
[image: Inline image 2] [image: Inline image 3] [image: Inline image 1]



On Sun, Sep 8, 2013 at 9:57 AM, Christian Smith
<christian@greenbutton.com> wrote:

> Hi Sumit,
>
> It seems the clocks are off; I should have checked that earlier!  Thanks
> for your help.
>
> -Christian
>
>
>
>
> On Sun, Sep 8, 2013 at 1:38 AM, Sumit Mohanty <smohanty@hortonworks.com> wrote:
>
>> Hi Christian,
>>
>> Heartbeat hostname not aligning with the registered hostname is the most
>> likely reason.
>>
>> Try these API calls to confirm:
>> curl -u user:passwd http://AmbariHost:8080/api/v1/hosts
>> This will tell you how many hosts are registered and their hostnames (the
>> FQDN is what is typically used for registration).
>>
>> You can compare that with
>> curl -u user:passwd http://AmbariHost:8080/api/v1/clusters/YourClusterName/hosts
>> which tells you the list of hosts that the cluster is associated with.
>>
>> If indeed there is a hostname mismatch, you can modify the hostname on
>> the host itself and restart the agent.
>>
>> If you can't modify the hostname for some reason, let us know. There is a
>> way for Ambari agents to override the host-supplied hostname as well.
>> However, the prior solution is preferred.
>>
>> -Sumit
>> From: Christian Smith <christian@greenbutton.com>
>> Reply-To: <ambari-user@incubator.apache.org>
>> Date: Saturday, September 7, 2013 2:56 AM
>> To: "ambari-user@incubator.apache.org" <ambari-user@incubator.apache.org>
>> Subject: Ambari server claiming no heartbeats from agents
>>
>> Hi,
>>
>> I've got a new cluster configured via the API with HDFS and MR.  The
>> configuration went fine and the HDFS service says it's running.  However, on
>> the hosts tab, all hosts are marked with a yellow circle and state that no
>> heartbeat has been received for over 3 minutes.
>>
>> I've checked the agent and server logs and heartbeats are being sent and
>> received by the expected parties.  So my question is what could be going
>> wrong?  And how does the server associate a received heartbeat with a host
>> in the cluster config?  Does the server do a reverse DNS lookup of the
>> heartbeat's source IP?  Or does the heartbeat contain the hostname of the
>> agent?
>>
>> It seems like something around the heartbeat hostname is not aligned with
>> what the server is expecting...
>>
>> Any ideas how to debug further?
>>
>> Cheers,
>> Christian
>
>
>
