incubator-ambari-user mailing list archives

From Sumit Mohanty <smoha...@hortonworks.com>
Subject Re: heartbeats being ignored
Date Mon, 15 Jul 2013 16:13:27 GMT
Is it possible that the FQDN/hostname of the agent hosts has changed?
E.g., the agents initially registered themselves as host A (you can get
that list from the API at server:8080/api/v1/clusters/<cluster name>/hosts),
and after the network reconfiguration they started sending their
heartbeats as host B (server:8080/api/v1/hosts will tell you which hosts
have registered).
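
A rough way to compare the two host lists is sketched below in Python. It
is only a sketch under assumptions: the response shape
({"items": [{"Hosts": {"host_name": ...}}]}), the use of HTTP basic auth,
and the helper names fetch_json, host_names and compare_hosts are all
illustrative and should be checked against your Ambari version.

```python
import base64
import json
import urllib.request


def fetch_json(url, user, password):
    """GET a URL with HTTP basic auth and decode the JSON body.

    Assumption: the Ambari server accepts basic auth on its REST API.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def host_names(payload):
    """Collect host names from a response body.

    Assumed shape: {"items": [{"Hosts": {"host_name": "..."}}]}
    """
    return {item["Hosts"]["host_name"] for item in payload.get("items", [])}


def compare_hosts(cluster_payload, registered_payload):
    """Report names present in only one of the two host lists."""
    in_cluster = host_names(cluster_payload)
    registered = host_names(registered_payload)
    return {
        "cluster_only": sorted(in_cluster - registered),
        "registered_only": sorted(registered - in_cluster),
    }


# Example use (hypothetical server name, cluster name and credentials):
# base = "http://server:8080/api/v1"
# cluster = fetch_json(base + "/clusters/mycluster/hosts", "admin", "admin")
# registered = fetch_json(base + "/hosts", "admin", "admin")
# print(compare_hosts(cluster, registered))
```

Names that show up under registered_only would be agents heartbeating
under a name the cluster does not know; names under cluster_only would be
the old names that no agent is using any more.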

-Sumit

On 7/15/13 8:47 AM, "Brian Jeltema" <brian.jeltema@digitalenvoy.net> wrote:

>I had to do some network reconfiguration on our cluster. After rebooting
>everything and restarting
>the ambari server and the ambari agents, the server reports (via the UI)
>that it is not receiving heartbeats.
>However, when I look at the server and agent logs, I see heartbeat
>activity:
>
>agent:
>INFO 2013-07-15 11:40:12,169 Heartbeat.py:61 - Sending heartbeat with
>response id: 251 and timestamp: 1373902812168
>INFO 2013-07-15 11:40:12,214 Controller.py:176 - No commands sent from
>the Server.
>
>server:
>11:41:44,760  INFO HeartBeatHandler:108 - Received heartbeat from host,
>hostname=foo.net, currentResponseId=260, receivedResponseId=260
>11:41:44,761  INFO AgentResource:109 - Sending heartbeat response with
>response id 261
>
>(response id's don't match because I didn't try to capture them in
>unison). I suspect there may be persisted state in the postgres database
>from the previous network configuration that is causing the problem. Any
>suggestions for a fix short of a complete redeploy?
>
>TIA
>
>Brian


