Date: Wed, 7 Feb 2018 14:30:00 +0000 (UTC)
From: "Toshihiro Suzuki (JIRA)"
To: issues@ambari.apache.org
Reply-To: dev@ambari.apache.org
Subject: [jira] [Commented] (AMBARI-22918) Decommission RegionServer fails when kerberos is enabled


    [ https://issues.apache.org/jira/browse/AMBARI-22918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16355517#comment-16355517 ]

Toshihiro Suzuki commented on AMBARI-22918:
-------------------------------------------

When I ran the command from my previous comment, the ps output was as follows:
{code:java}
root 2891 154 1.4 3226792 87852 pts/0 Sl+ 09:40 0:01 /usr/jdk64/jdk1.8.0_112/bin/java -Dproc_org.jruby.Main -XX:OnOutOfMemoryError=kill -9 %p -Dhdp.version=2.6.2.0-205 -Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hbase/hs_err_pid%p.log -Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_client_jaas.conf -Djava.io.tmpdir=/tmp -Dhbase.log.dir=/var/log/hbase -Dhbase.log.file=hbase.log -Dhbase.home.dir=/usr/hdp/2.6.2.0-205/hbase -Dhbase.id.str= -Dhbase.root.logger=INFO,console -Djava.library.path=:/usr/hdp/2.6.2.0-205/hadoop/lib/native/Linux-amd64-64:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:/usr/hdp/2.6.2.0-205/hadoop/lib/native -Dhbase.security.logger=INFO,NullAppender org.jruby.Main /usr/hdp/current/hbase-master/bin/draining_servers.rb add worker1
{code}
It looks like "java.security.auth.login.config" was set twice.

The first one is set on the command line:
{code:java}
-Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf
{code}
The second one is set in hbase-env.sh:
{code:java}
-Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_client_jaas.conf
{code}
As [~rguruvannagari] mentioned, the JVM seems to use the second one (at least it did in my environment). I thought we needed hbase_master_jaas.conf when running draining_servers.rb (and region_mover.rb), but it seems we don't.

It also seems we don't even need hbase_client_jaas.conf. When I removed "java.security.auth.login.config" from both the command line (hbase_master_jaas.conf) and hbase-env.sh (hbase_client_jaas.conf), the command still worked. So it looks like we don't need to set any JAAS file in this case. As for a fix for this issue, we can simply remove "\{master_security_config}", as [~rguruvannagari] suggested:
{code}
66	"{kinit_cmd} {hbase_cmd} --config {hbase_conf_dir} org.jruby.Main {region_drainer} remove {host}")
78	"{kinit_cmd} {hbase_cmd} --config {hbase_conf_dir} org.jruby.Main {region_drainer} add {host}")
80	"{kinit_cmd} {hbase_cmd} --config {hbase_conf_dir} org.jruby.Main {region_mover} unload {host}")
{code}
Any objections?

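The duplicate-option observation above can be checked mechanically. The following is a minimal sketch and not part of Ambari or HBase: the helper names `system_properties` and `duplicated_properties` are hypothetical, the sample string is an abbreviated copy of the ps output in this comment, and the "last occurrence wins" rule is an assumption based on the behavior reported here, not a documented guarantee.

```python
def system_properties(command_line):
    """Group every -Dname=value token by property name, in order of appearance.

    Simplification: the string is split on whitespace only, which matches
    ps output (quoting is already lost there), and tokens after the main
    class are not treated specially.
    """
    props = {}
    for token in command_line.split():
        if token.startswith("-D") and "=" in token:
            name, value = token[2:].split("=", 1)
            props.setdefault(name, []).append(value)
    return props


def duplicated_properties(command_line):
    """Return only the properties that appear more than once."""
    return {
        name: values
        for name, values in system_properties(command_line).items()
        if len(values) > 1
    }


# Abbreviated copy of the ps output quoted in this comment.
PS_ARGS = (
    "/usr/jdk64/jdk1.8.0_112/bin/java -Dproc_org.jruby.Main "
    "-Djava.security.auth.login.config="
    "/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf "
    "-XX:+UseConcMarkSweepGC "
    "-Djava.security.auth.login.config="
    "/usr/hdp/current/hbase-master/conf/hbase_client_jaas.conf "
    "-Djava.io.tmpdir=/tmp org.jruby.Main "
    "/usr/hdp/current/hbase-master/bin/draining_servers.rb add worker1"
)

for name, values in duplicated_properties(PS_ARGS).items():
    print("%s is set %d times" % (name, len(values)))
    # Assumption: the last occurrence is the one the JVM ends up using.
    print("  presumably effective value: %s" % values[-1])
```

Running a check like this against the full ps output is a quick way to confirm which of the two JAAS files ends up in effect before deciding whether "\{master_security_config}" can be dropped.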
> Decommission RegionServer fails when kerberos is enabled
> --------------------------------------------------------
>
>                 Key: AMBARI-22918
>                 URL: https://issues.apache.org/jira/browse/AMBARI-22918
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>            Reporter: Toshihiro Suzuki
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When kerberos is enabled, Decommission RegionServer fails with the following errors:
> stderr:
> {code:java}
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/common-services/HBASE/0.96.0.2.0/package/scripts/hbase_master.py", line 114, in <module>
>     HbaseMaster().execute()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 329, in execute
>     method(env)
>   File "/var/lib/ambari-agent/cache/common-services/HBASE/0.96.0.2.0/package/scripts/hbase_master.py", line 55, in decommission
>     hbase_decommission(env)
>   File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
>     return fn(*args, **kwargs)
>   File "/var/lib/ambari-agent/cache/common-services/HBASE/0.96.0.2.0/package/scripts/hbase_decommission.py", line 84, in hbase_decommission
>     logoutput=True
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 166, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 262, in action_run
>     tries=self.resource.tries, try_sleep=self.resource.try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner
>     result = function(command, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call
>     tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper
>     result = _call(command, **kwargs_copy)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call
>     raise ExecutionFailed(err_msg, code, out, err)
> resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/bin/kinit -kt /etc/security/keytabs/hbase.service.keytab hbase/master2@EXAMPLE.COM; /usr/hdp/current/hbase-master/bin/hbase --config /usr/hdp/current/hbase-master/conf -Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf org.jruby.Main /usr/hdp/current/hbase-master/bin/draining_servers.rb add worker1' returned 1. Error: Could not find or load main class org.jruby.Main
> {code}
> stdout:
> {code:java}
> 2018-02-06 07:25:03,453 - Stack Feature Version Info: Cluster Stack=2.6, Cluster Current Version=2.6.2.0-205, Command Stack=None, Command Version=2.6.2.0-205 -> 2.6.2.0-205
> 2018-02-06 07:25:03,476 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
> 2018-02-06 07:25:03,484 - checked_call['hostid'] {}
> 2018-02-06 07:25:03,490 - checked_call returned (0, '1aacc56c')
> 2018-02-06 07:25:03,502 - File['/usr/hdp/current/hbase-master/bin/draining_servers.rb'] {'content': StaticFile('draining_servers.rb'), 'mode': 0755}
> 2018-02-06 07:25:03,504 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/hbase.service.keytab hbase/master2@EXAMPLE.COM; /usr/hdp/current/hbase-master/bin/hbase --config /usr/hdp/current/hbase-master/conf -Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf org.jruby.Main /usr/hdp/current/hbase-master/bin/draining_servers.rb add worker1'] {'logoutput': True, 'user': 'hbase'}
> Error: Could not find or load main class org.jruby.Main
> Command failed after 1 tries
> {code}
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)