ambari-issues mailing list archives

From "Sandor Magyari (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AMBARI-15393) Add stderr output of Ambari auto-recovery commands in agent log
Date Fri, 11 Mar 2016 17:40:39 GMT

     [ https://issues.apache.org/jira/browse/AMBARI-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Sandor Magyari updated AMBARI-15393:
------------------------------------
    Status: Patch Available  (was: Open)

> Add stderr output of Ambari auto-recovery commands in agent log
> ---------------------------------------------------------------
>
>                 Key: AMBARI-15393
>                 URL: https://issues.apache.org/jira/browse/AMBARI-15393
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-agent
>    Affects Versions: 2.2.1
>            Reporter: Sandor Magyari
>            Assignee: Sandor Magyari
>            Priority: Critical
>             Fix For: 2.2.2
>
>         Attachments: AMBARI-15393.patch
>
>
> Customers rely on Ambari auto-recovery logic to recover from component start failures
during cluster creation. The idea is to improve reliability (through retries) at the cost
of some added latency.
> In some cases we see cluster creation fail because a component start fails and auto-recovery
is unable to start that component for up to 2 hrs, most often on headnodes for the HIVE_SERVER,
OOZIE_SERVER, and NAMENODE components.
> The problem is that these kinds of failures are hard to investigate later, because the
auto-recovery files are neither sent to the server side nor saved in the Ambari agent logs;
they are only stored on the agent.
> The solution is to add a new option, log_auto_execute_errors, to the logging section of
ambari-agent.ini. When this option is enabled, the agent appends the stderr of each auto-recovery
command to the agent log.
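
A minimal sketch of the behavior described above, not the attached patch. Only the section name (logging) and the option name (log_auto_execute_errors) come from the issue text. The value format (1 to enable), the helper names, and the logger name are assumptions made for illustration.

```python
import configparser
import logging
import subprocess
import sys

# Hypothetical ambari-agent.ini fragment. The section and option names are
# taken from the issue description; the value "1" is an assumed format.
AGENT_INI = """
[logging]
log_auto_execute_errors = 1
"""

# Stand-in for the agent log; the real agent configures its own logger.
logger = logging.getLogger("agent_sketch")


def stderr_logging_enabled(ini_text):
    """Return True if the option is switched on; default to off when absent."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.getboolean("logging", "log_auto_execute_errors", fallback=False)


def run_recovery_command(argv, log_errors):
    """Run a command and, if enabled, append its stderr to the log."""
    result = subprocess.run(argv, capture_output=True, text=True)
    if log_errors and result.stderr:
        logger.error(
            "Auto-recovery command %s exited with %d, stderr:\n%s",
            argv, result.returncode, result.stderr,
        )
    return result.returncode


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    enabled = stderr_logging_enabled(AGENT_INI)
    # A stand-in for a failing component start command.
    failing = [sys.executable, "-c", "import sys; sys.stderr.write('start failed'); sys.exit(1)"]
    run_recovery_command(failing, enabled)
```

Defaulting the option to off when it is missing keeps existing agent logs unchanged unless an operator opts in, which seems consistent with the issue describing it as something that can be enabled.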



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
