ambari-issues mailing list archives

From "Sebastian Toader (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AMBARI-19416) Ambari agents remain in heartbeat lost state after ambari server restart
Date Sun, 08 Jan 2017 09:11:58 GMT

     [ https://issues.apache.org/jira/browse/AMBARI-19416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Sebastian Toader updated AMBARI-19416:
--------------------------------------
    Attachment: AMBARI-19416.v1.patch

> Ambari agents remain in heartbeat lost state after ambari server restart
> ------------------------------------------------------------------------
>
>                 Key: AMBARI-19416
>                 URL: https://issues.apache.org/jira/browse/AMBARI-19416
>             Project: Ambari
>          Issue Type: Bug
>            Reporter: Sebastian Toader
>            Assignee: Sebastian Toader
>            Priority: Critical
>             Fix For: 3.0.0
>
>         Attachments: AMBARI-19416.v1.patch
>
>
> With the implementation of https://issues.apache.org/jira/browse/AMBARI-18505, status commands
are executed in a separate child process. Status commands that the ambari agent receives from
the server are passed to the status command executor child process via a queue ({{multiprocessing.Queue()}}).
If the child process is killed, either manually or by the parent process, the queue may
end up in a bad state (see: http://bugs.python.org/issue20527), so the re-spawned status command
executor child process may no longer receive new status commands.
> When ambari server is restarted, the agent re-registers with ambari server and, upon re-registration,
re-spawns the status command child process in order to pick up up-to-date agent configs
(https://issues.apache.org/jira/browse/AMBARI-19392). In this case the status command executor
child process may not receive status commands because the queue may be stuck, which leaves
the ambari agent in heartbeat lost state.
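
The failure mode described above can be avoided in principle by never reusing a queue that a killed child was attached to. The following is a minimal sketch of that idea only, not the contents of AMBARI-19416.v1.patch: the names status_worker, StatusExecutorHandle, respawn and run_status_command are hypothetical, and the "fork" start method is assumed (POSIX only) so the sketch runs without a __main__ guard.

```python
import multiprocessing

# Assumption: "fork" start method (POSIX only), as on a typical Linux agent host.
_ctx = multiprocessing.get_context("fork")


def status_worker(commands, results):
    """Hypothetical child loop: handle commands until a None sentinel arrives."""
    while True:
        command = commands.get()
        if command is None:
            break
        results.put("handled " + command)


class StatusExecutorHandle:
    """Hypothetical parent-side holder for the child process and its queues."""

    def __init__(self):
        self.process = None
        self.commands = None
        self.results = None

    def respawn(self):
        # Stop the previous child, if any. It is most likely blocked in
        # commands.get(), so killing it can leave that queue's internal
        # lock held (the situation described in bugs.python.org/issue20527).
        if self.process is not None and self.process.is_alive():
            self.process.terminate()
            self.process.join()
        if self.commands is not None:
            # Shut down the parent's feeder thread for the old queue.
            self.commands.close()
            self.commands.join_thread()
        # Create fresh queues instead of handing the old ones to the new child.
        self.commands = _ctx.Queue()
        self.results = _ctx.Queue()
        self.process = _ctx.Process(
            target=status_worker,
            args=(self.commands, self.results),
            daemon=True,
        )
        self.process.start()

    def run_status_command(self, command, timeout=10):
        # Raises queue.Empty if the child does not answer within the timeout.
        self.commands.put(command)
        return self.results.get(timeout=timeout)
```

In this sketch, calling respawn() on re-registration gives the new child queues that no killed process has ever touched, so a lock left behind by the old child cannot block delivery of new status commands.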



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
