Date: Tue, 21 Oct 2014 15:16:33 +0000 (UTC)
From: "Andrew Onischuk (JIRA)"
To: dev@ambari.apache.org
Reply-To: dev@ambari.apache.org
Subject: [jira] [Resolved] (AMBARI-7882) Decommission of JobTracker fails on secure cluster

[ https://issues.apache.org/jira/browse/AMBARI-7882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Onischuk resolved AMBARI-7882.
-------------------------------------
    Resolution: Fixed

Committed to trunk and branch-1.7.0

> Decommission of JobTracker fails on secure cluster
> --------------------------------------------------
>
> Key: AMBARI-7882
> URL: https://issues.apache.org/jira/browse/AMBARI-7882
> Project: Ambari
> Issue Type: Bug
> Reporter: Andrew Onischuk
> Assignee: Andrew Onischuk
> Fix For: 1.7.0
>
>
> Exception text:
>
> {
>   "href" : "http://ec2-54-165-160-62.compute-1.amazonaws.com:8080/api/v1/clusters/cl1/requests/21/tasks/235",
>   "Tasks" : {
>     "attempt_cnt" : 1,
>     "cluster_name" : "cl1",
>     "command" : "CUSTOM_COMMAND",
>     "command_detail" : "DECOMMISSION, Excluded: ip-172-31-37-151.ec2.internal",
>     "custom_command_name" : "DECOMMISSION",
>     "end_time" : 1413796875994,
>     "error_log" : "/var/lib/ambari-agent/data/errors-235.txt",
>     "exit_code" : 1,
>     "host_name" : "ip-172-31-37-148.ec2.internal",
>     "id" : 235,
>     "output_log" : "/var/lib/ambari-agent/data/output-235.txt",
>     "request_id" : 21,
>     "role" : "JOBTRACKER",
>     "stage_id" : 1,
>     "start_time" : 1413796870551,
>     "status" : "FAILED",
>     "stderr" : "2014-10-20 09:21:15,291 - Error while executing command 'decommission':\nTraceback (most recent call last):\n File \"/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py\", line 122, in execute\n method(env)\n File \"/var/lib/ambari-agent/cache/stacks/HDP/1.3.2/services/MAPREDUCE/package/scripts/jobtracker.py\", line 78, in decommission\n kinit_override=True)\n File \"/usr/lib/python2.6/site-packages/resource_management/core/base.py\", line 148, in __init__\n self.env.run()\n File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 149, in run\n self.run_action(resource, action)\n File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 115, in run_action\n provider_action()\n File \"/usr/lib/python2.6/site-packages/resource_management/libraries/providers/execute_hadoop.py\", line 50, in action_run\n path = self.resource.bin_dir\n File \"/usr/lib/python2.6/site-packages/resource_management/core/base.py\", line 148, in __init__\n self.env.run()\n File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 149, in run\n self.run_action(resource, action)\n File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 115, in run_action\n provider_action()\n File \"/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py\", line 237, in action_run\n raise ex\nFail: Execution of 'hadoop --config /etc/hadoop/conf mradmin -refreshNodes' returned 255. 14/10/20 09:21:15 ERROR security.UserGroupInformation: PriviledgedActionException as:mapred cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]\n14/10/20 09:21:15 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]\n14/10/20 09:21:15 ERROR security.UserGroupInformation: PriviledgedActionException as:mapred cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]\nrefreshNodes: Call to ip-172-31-37-148.ec2.internal/172.31.37.148:50300 failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]",
>     "stdout" : "2014-10-20 09:21:11,334 - File['/etc/hadoop/conf/mapred.exclude'] {'owner': 'mapred', 'content': Template('exclude_hosts_list.j2'), 'group': 'hadoop'}\n2014-10-20 09:21:11,338 - Writing File['/etc/hadoop/conf/mapred.exclude'] because contents don't match\n2014-10-20 09:21:11,339 - ExecuteHadoop['mradmin -refreshNodes'] {'conf_dir': '/etc/hadoop/conf', 'kinit_override': True, 'user': 'mapred'}\n2014-10-20 09:21:11,341 - Execute['hadoop --config /etc/hadoop/conf mradmin -refreshNodes'] {'logoutput': False, 'path': [], 'tries': 1, 'user': 'mapred', 'try_sleep': 0}\n2014-10-20 09:21:15,291 - Error while executing command 'decommission':\nTraceback (most recent call last):\n File \"/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py\", line 122, in execute\n method(env)\n File \"/var/lib/ambari-agent/cache/stacks/HDP/1.3.2/services/MAPREDUCE/package/scripts/jobtracker.py\", line 78, in decommission\n kinit_override=True)\n File \"/usr/lib/python2.6/site-packages/resource_management/core/base.py\", line 148, in __init__\n self.env.run()\n File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 149, in run\n self.run_action(resource, action)\n File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 115, in run_action\n provider_action()\n File \"/usr/lib/python2.6/site-packages/resource_management/libraries/providers/execute_hadoop.py\", line 50, in action_run\n path = self.resource.bin_dir\n File \"/usr/lib/python2.6/site-packages/resource_management/core/base.py\", line 148, in __init__\n self.env.run()\n File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 149, in run\n self.run_action(resource, action)\n File \"/usr/lib/python2.6/site-packages/resource_management/core/environment.py\", line 115, in run_action\n provider_action()\n File \"/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py\", line 237, in action_run\n raise ex\nFail: Execution of 'hadoop --config /etc/hadoop/conf mradmin -refreshNodes' returned 255. 14/10/20 09:21:15 ERROR security.UserGroupInformation: PriviledgedActionException as:mapred cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]\n14/10/20 09:21:15 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]\n14/10/20 09:21:15 ERROR security.UserGroupInformation: PriviledgedActionException as:mapred cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]\nrefreshNodes: Call to ip-172-31-37-148.ec2.internal/172.31.37.148:50300 failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]",
>     "structured_out" : { }
>   }
> }
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)