ambari-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMBARI-21593) RU: AMS stopped after RU [AMS distributed mode]
Date Fri, 28 Jul 2017 08:52:00 GMT

    [ https://issues.apache.org/jira/browse/AMBARI-21593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104676#comment-16104676 ]

Hadoop QA commented on AMBARI-21593:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12879292/AMBARI-21593.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in ambari-metrics/ambari-metrics-timelineservice.

Console output: https://builds.apache.org/job/Ambari-trunk-test-patch/11882//console

This message is automatically generated.

> RU: AMS stopped after RU [AMS distributed mode]
> -----------------------------------------------
>
>                 Key: AMBARI-21593
>                 URL: https://issues.apache.org/jira/browse/AMBARI-21593
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-metrics
>    Affects Versions: 2.5.2
>            Reporter: Aravindan Vijayan
>            Assignee: Aravindan Vijayan
>            Priority: Blocker
>             Fix For: 2.5.2
>
>         Attachments: AMBARI-21593.patch
>
>
> *PROBLEM*
> When 2 metric collectors are started up simultaneously, both of them fail to start.
> *BUG*
> A race condition exists in the Metric Collector HA controller initialization; it was introduced through AMBARI-20179. When a Helix controller instance finds that the /ambari-metrics-collector znode exists but a child node does not exist, it deletes the entire znode and recreates it. If another controller instance initializes at the same time, each instance can end up cancelling the effort of the other.
> *FIX*
> Do not delete and recreate the znode. Instead, wait and retry for a few seconds to check whether /ambari-metrics-collector was fully initialized.
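
The wait-and-retry approach described under FIX could be sketched roughly as follows. This is only an illustration and not the code in AMBARI-21593.patch: the class name ZnodeInitWait, the method waitUntilInitialized, and its parameters are hypothetical, and the actual check against ZooKeeper is left to a caller-supplied predicate so that no Helix or ZooKeeper API is assumed.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not taken from the Ambari code base.
final class ZnodeInitWait {
    private ZnodeInitWait() {}

    /**
     * Polls the supplied check until it reports true or the attempts run out.
     * Nothing is deleted or recreated here; the caller only observes state.
     *
     * @param isInitialized caller-supplied check, e.g. "does the expected child znode exist yet?"
     * @param maxAttempts   maximum number of times the check is evaluated
     * @param sleepMillis   pause between two consecutive checks
     * @return true if the check succeeded within maxAttempts, false otherwise
     */
    static boolean waitUntilInitialized(BooleanSupplier isInitialized,
                                        int maxAttempts,
                                        long sleepMillis) throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (isInitialized.getAsBoolean()) {
                return true;
            }
            // Sleep only if another attempt will follow.
            if (attempt < maxAttempts) {
                Thread.sleep(sleepMillis);
            }
        }
        return false;
    }
}
```

Because neither controller removes the znode while the other is still populating it, two instances starting together no longer undo each other's work; the instance that arrives second just waits for the first to finish. What to do when the wait times out is a separate decision that this sketch leaves to the caller.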



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
