ambari-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMBARI-14847) Concurrent kinit Commands Cause Alerts To Randomly Trigger
Date Mon, 01 Feb 2016 17:32:39 GMT

    [ https://issues.apache.org/jira/browse/AMBARI-14847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15126610#comment-15126610 ]

Hudson commented on AMBARI-14847:
---------------------------------

FAILURE: Integrated in Ambari-branch-2.2 #276 (See [https://builds.apache.org/job/Ambari-branch-2.2/276/])
AMBARI-14847 - Concurrent kinit Commands Cause Alerts To Randomly (jhurley: [http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=commit&h=5d1a67d6f65104c715f61a4d4adad9bb0c48e90b])
* ambari-server/src/main/resources/common-services/HIVE/0.12.0.2.0/package/alerts/alert_webhcat_server.py
* ambari-server/src/test/python/TestGlobalLock.py
* ambari-server/src/main/resources/common-services/HIVE/0.12.0.2.0/package/alerts/alert_hive_metastore.py
* ambari-common/src/main/python/resource_management/libraries/functions/hive_check.py
* ambari-common/src/main/python/resource_management/core/global_lock.py
* ambari-server/src/main/resources/common-services/OOZIE/4.0.0.2.0/package/alerts/alert_check_oozie_server.py
* ambari-common/src/main/python/resource_management/libraries/functions/curl_krb_request.py


> Concurrent kinit Commands Cause Alerts To Randomly Trigger
> ----------------------------------------------------------
>
>                 Key: AMBARI-14847
>                 URL: https://issues.apache.org/jira/browse/AMBARI-14847
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-agent
>    Affects Versions: 2.0.0
>            Reporter: Jonathan Hurley
>            Assignee: Jonathan Hurley
>            Priority: Critical
>             Fix For: 2.2.2
>
>         Attachments: AMBARI-14847.patch
>
>
> The alerts framework on each Ambari Agent runs alerts in a threadpool when the job triggers.
> This can cause the following error to randomly appear and the alert to go CRITICAL:
> {noformat}
>  Connection failed to http://nat-rare-21-dvitiiuk-2-5.novalocal:8088 (Execution of '/usr/bin/kinit -l 5m -c /var/lib/ambari-agent/tmp/web_alert_cc_f3f99363c3b7d1667f1287ce3a35aa52 -kt /etc/security/keytabs/spnego.service.keytab HTTP/nat-rare-21-dvitiiuk-2-5.novalocal@EXAMPLE.COM > /dev/null' returned 1.
> kinit: Internal credentials cache error while storing credentials while getting initial credentials)
> {noformat}
> The alerts would randomly go CRITICAL at the end of their ticket expiration time only
> to become OK again shortly after.
> The cause is that the {{kinit}} command being executed to create new credentials cannot
> be run concurrently for the same user.
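
For illustration only, the following is a minimal sketch of how concurrent kinit invocations from alert threads could be serialized with a process-wide named lock. It is not the code from the attached patch: the names get_lock and run_serialized, the "kerberos" lock name, and the runner parameter are all hypothetical, and the sketch assumes every kinit call originates from threads inside a single agent process.

```python
import subprocess
import threading

# Hypothetical registry of named, process-wide locks (names are illustrative,
# not taken from the Ambari patch).
_LOCKS = {}
_LOCKS_GUARD = threading.Lock()


def get_lock(name):
    """Return the single shared lock for `name`, creating it on first use."""
    with _LOCKS_GUARD:
        if name not in _LOCKS:
            _LOCKS[name] = threading.RLock()
        return _LOCKS[name]


def run_serialized(lock_name, command, runner=subprocess.call):
    """Run `command` while holding the named lock.

    Only one thread at a time can be inside `runner` for a given lock name,
    so two alert threads never execute kinit simultaneously.
    """
    with get_lock(lock_name):
        return runner(command)
```

An alert thread would then call something like run_serialized("kerberos", ["/usr/bin/kinit", "-l", "5m", "-c", ccache_path, "-kt", keytab_path, principal]), where ccache_path, keytab_path and principal are placeholders. A threading lock of this kind only protects threads within one process; kinit commands started by other processes for the same user would need a different mechanism, such as a file lock.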



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
