Date: Wed, 23 Mar 2016 20:09:25 +0000 (UTC)
From: "Sandor Magyari (JIRA)"
To: issues@ambari.apache.org
Reply-To: dev@ambari.apache.org
Subject: [jira] [Updated] (AMBARI-15393) Add stderr output of Ambari auto-recovery commands in agent log


     [ https://issues.apache.org/jira/browse/AMBARI-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandor Magyari updated AMBARI-15393:
------------------------------------
    Attachment:     (was: AMBARI-15393.patch)

> Add stderr output of Ambari auto-recovery commands in agent log
> ---------------------------------------------------------------
>
>                 Key: AMBARI-15393
>                 URL: https://issues.apache.org/jira/browse/AMBARI-15393
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-agent
>    Affects Versions: 2.2.1
>            Reporter: Sandor Magyari
>            Assignee: Sandor Magyari
>            Priority: Critical
>             Fix For: 2.2.2
>
>
> Users rely on Ambari auto-recovery logic to recover from component start failures during cluster creation. The idea is to improve reliability (through retries) at the cost of some latency.
> In some cases cluster creation fails because a component fails to start and auto-recovery is unable to start it for up to 2 hours, most often on headnodes for the HIVE_SERVER, OOZIE_SERVER, and NAMENODE components.
> These failures are hard to investigate later, because the auto-recovery files are neither sent to the server side nor saved in the Ambari agent logs; they are stored only on the agent.
> The solution is to add a new option, log_auto_execute_errors, to the logging section of ambari-agent.ini. When this option is enabled, the agent appends the stderr of the auto-recovery command to the agent log.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
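
For illustration only, the proposed log_auto_execute_errors behaviour could be sketched roughly as below. This is not the attached patch: only the option name, the logging section, and the ambari-agent.ini file name come from the issue text. The value format ("1"), the helper names should_log_auto_execute_errors and run_auto_recovery_command, and the log message layout are assumptions made for the sketch.

```python
import configparser
import logging
import subprocess

# Hypothetical ambari-agent.ini fragment. The section and option names are
# taken from the issue description; the value "1" is an assumption.
SAMPLE_INI = """
[logging]
log_auto_execute_errors = 1
"""


def should_log_auto_execute_errors(ini_text):
    """Return True if the (assumed boolean) option is enabled; default is off."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.getboolean("logging", "log_auto_execute_errors", fallback=False)


def run_auto_recovery_command(argv, log_errors, logger):
    """Run a command and, if log_errors is set, append its stderr to the log.

    Hypothetical helper: the real agent runs recovery commands through its
    own execution machinery, which is not shown in the issue.
    """
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if log_errors and proc.stderr:
        logger.error(
            "Auto-recovery command %s exited with code %d, stderr:\n%s",
            argv,
            proc.returncode,
            proc.stderr,
        )
    return proc.returncode
```

With the option absent or disabled the helper stays silent, which matches the issue's framing of the feature as opt-in: the stderr of failed start attempts can be large, and retries repeat it.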