ambari-issues mailing list archives

From "Toshihiro Suzuki (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMBARI-22918) Decommission RegionServer fails when kerberos is enabled
Date Wed, 07 Feb 2018 14:30:00 GMT

    [ https://issues.apache.org/jira/browse/AMBARI-22918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16355517#comment-16355517 ]

Toshihiro Suzuki commented on AMBARI-22918:
-------------------------------------------

When I ran the command in my previous comment, the ps output was as follows:
{code:java}
root 2891 154 1.4 3226792 87852 pts/0 Sl+ 09:40 0:01 /usr/jdk64/jdk1.8.0_112/bin/java -Dproc_org.jruby.Main
-XX:OnOutOfMemoryError=kill -9 %p -Dhdp.version=2.6.2.0-205 -Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf
-XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hbase/hs_err_pid%p.log -Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_client_jaas.conf
-Djava.io.tmpdir=/tmp -Dhbase.log.dir=/var/log/hbase -Dhbase.log.file=hbase.log -Dhbase.home.dir=/usr/hdp/2.6.2.0-205/hbase
-Dhbase.id.str= -Dhbase.root.logger=INFO,console -Djava.library.path=:/usr/hdp/2.6.2.0-205/hadoop/lib/native/Linux-amd64-64:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:/usr/hdp/2.6.2.0-205/hadoop/lib/native
-Dhbase.security.logger=INFO,NullAppender org.jruby.Main /usr/hdp/current/hbase-master/bin/draining_servers.rb
add worker1
{code}
It seems that "java.security.auth.login.config" was set twice.

The following is set on the command line:
{code:java}
-Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf
{code}
The following is set in hbase-env.sh:
{code:java}
-Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_client_jaas.conf
{code}
And as [~rguruvannagari] mentioned, the JVM appears to use the second one (at least it did
in my environment). I had thought that we need hbase_master_jaas.conf when we run
draining_servers.rb (and region_mover.rb), but it seems we don't.
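
To make the duplicate easier to spot, here is a minimal Python sketch that scans a JVM command line for repeated -D options. The function name effective_system_properties is hypothetical, the sample command line is shortened from the ps output above, and the "last one wins" rule is an assumption based only on what was observed in this thread. As a simplification, it inspects every token and does not stop at the main class.

```python
import shlex


def effective_system_properties(command_line):
    """Return (effective, duplicates) for -Dkey=value tokens in a command line.

    Assumption: when a property is repeated, the last value is the one
    the JVM ends up using (as observed in this thread).
    """
    effective = {}
    seen = {}
    for token in shlex.split(command_line):
        if not token.startswith("-D"):
            continue
        key, _, value = token[2:].partition("=")
        seen.setdefault(key, []).append(value)
        effective[key] = value  # a later occurrence overwrites an earlier one
    duplicates = {k: v for k, v in seen.items() if len(v) > 1}
    return effective, duplicates


# Shortened from the ps output above; only the relevant options are kept.
cmd = (
    "java "
    "-Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf "
    "-XX:+UseConcMarkSweepGC "
    "-Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_client_jaas.conf "
    "-Djava.io.tmpdir=/tmp org.jruby.Main"
)
effective, duplicates = effective_system_properties(cmd)
print(effective["java.security.auth.login.config"])
print(sorted(duplicates))
```

Under that assumption, the value reported for java.security.auth.login.config is the hbase_client_jaas.conf path from hbase-env.sh, not the hbase_master_jaas.conf path passed on the command line.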

Also, it seems we don't even need hbase_client_jaas.conf. Even when I removed "java.security.auth.login.config"
from both the command line (hbase_master_jaas.conf) and hbase-env.sh (hbase_client_jaas.conf), the
command worked. Therefore, it looks like we don't need to set any JAAS file in this case.
As for a fix for this issue, we can simply remove "{master_security_config}", as [~rguruvannagari]
suggested:
{code}
66	"{kinit_cmd} {hbase_cmd} --config {hbase_conf_dir} org.jruby.Main {region_drainer} remove {host}")
78	"{kinit_cmd} {hbase_cmd} --config {hbase_conf_dir} org.jruby.Main {region_drainer} add {host}")
80	"{kinit_cmd} {hbase_cmd} --config {hbase_conf_dir} org.jruby.Main {region_drainer} unload {host}")
{code}
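
As a rough illustration of what the change does to the resulting command, the Python sketch below expands the "add" template with and without the security option. The parameter values are copied from the Execute[...] line in the issue description. The "before" template is reconstructed from that failing command rather than copied from hbase_decommission.py, and plain str.format stands in for whatever formatting helper the Ambari script actually uses, so treat both as assumptions.

```python
# Values taken from the Execute[...] line quoted in the issue description.
params = {
    "kinit_cmd": "/usr/bin/kinit -kt /etc/security/keytabs/hbase.service.keytab hbase/master2@EXAMPLE.COM;",
    "hbase_cmd": "/usr/hdp/current/hbase-master/bin/hbase",
    "hbase_conf_dir": "/usr/hdp/current/hbase-master/conf",
    "master_security_config": "-Djava.security.auth.login.config="
                              "/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf",
    "region_drainer": "/usr/hdp/current/hbase-master/bin/draining_servers.rb",
    "host": "worker1",
}

# Assumed shape of the current template, reconstructed from the failing command.
before = ("{kinit_cmd} {hbase_cmd} --config {hbase_conf_dir} {master_security_config} "
          "org.jruby.Main {region_drainer} add {host}")

# Proposed template: identical except that {master_security_config} is dropped.
after = "{kinit_cmd} {hbase_cmd} --config {hbase_conf_dir} org.jruby.Main {region_drainer} add {host}"

old_command = before.format(**params)
new_command = after.format(**params)  # unused keys in params are ignored

print(old_command)
print(new_command)
```

With the option removed, org.jruby.Main is the first argument after the --config pair, and the only "java.security.auth.login.config" left is whatever hbase-env.sh supplies.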

Any objections?


> Decommission RegionServer fails when kerberos is enabled
> --------------------------------------------------------
>
>                 Key: AMBARI-22918
>                 URL: https://issues.apache.org/jira/browse/AMBARI-22918
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>            Reporter: Toshihiro Suzuki
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When kerberos is enabled, Decommission RegionServer fails with the following errors:
> stderr:
> {code:java}
> Traceback (most recent call last):
>   File "/var/lib/ambari-agent/cache/common-services/HBASE/0.96.0.2.0/package/scripts/hbase_master.py", line 114, in <module>
>     HbaseMaster().execute()
>   File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 329, in execute
>     method(env)
>   File "/var/lib/ambari-agent/cache/common-services/HBASE/0.96.0.2.0/package/scripts/hbase_master.py", line 55, in decommission
>     hbase_decommission(env)
>   File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
>     return fn(*args, **kwargs)
>   File "/var/lib/ambari-agent/cache/common-services/HBASE/0.96.0.2.0/package/scripts/hbase_decommission.py", line 84, in hbase_decommission
>     logoutput=True
>   File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 166, in __init__
>     self.env.run()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
>     self.run_action(resource, action)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
>     provider_action()
>   File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 262, in action_run
>     tries=self.resource.tries, try_sleep=self.resource.try_sleep)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner
>     result = function(command, **kwargs)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call
>     tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper
>     result = _call(command, **kwargs_copy)
>   File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 303, in _call
>     raise ExecutionFailed(err_msg, code, out, err)
> resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/bin/kinit -kt /etc/security/keytabs/hbase.service.keytab hbase/master2@EXAMPLE.COM; /usr/hdp/current/hbase-master/bin/hbase --config /usr/hdp/current/hbase-master/conf -Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf org.jruby.Main /usr/hdp/current/hbase-master/bin/draining_servers.rb add worker1' returned 1. Error: Could not find or load main class org.jruby.Main{code}
> stdout:
> {code:java}
> 2018-02-06 07:25:03,453 - Stack Feature Version Info: Cluster Stack=2.6, Cluster Current Version=2.6.2.0-205, Command Stack=None, Command Version=2.6.2.0-205 -> 2.6.2.0-205
> 2018-02-06 07:25:03,476 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
> 2018-02-06 07:25:03,484 - checked_call['hostid'] {}
> 2018-02-06 07:25:03,490 - checked_call returned (0, '1aacc56c')
> 2018-02-06 07:25:03,502 - File['/usr/hdp/current/hbase-master/bin/draining_servers.rb'] {'content': StaticFile('draining_servers.rb'), 'mode': 0755}
> 2018-02-06 07:25:03,504 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/hbase.service.keytab hbase/master2@EXAMPLE.COM; /usr/hdp/current/hbase-master/bin/hbase --config /usr/hdp/current/hbase-master/conf -Djava.security.auth.login.config=/usr/hdp/current/hbase-master/conf/hbase_master_jaas.conf org.jruby.Main /usr/hdp/current/hbase-master/bin/draining_servers.rb add worker1'] {'logoutput': True, 'user': 'hbase'}
> Error: Could not find or load main class org.jruby.Main
> Command failed after 1 tries{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
