hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2805) RM2 in HA setup tries to login using the RM1's kerberos principal
Date Wed, 05 Nov 2014 20:04:34 GMT

    [ https://issues.apache.org/jira/browse/YARN-2805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14198973#comment-14198973 ]

Wangda Tan commented on YARN-2805:

I investigated this issue; it is caused by YARN-2795. As pointed out by [~xgong], the correct
behavior is to set up the HA configuration before login.
I have uploaded a fix and tested it on an HA + security cluster.
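
To illustrate why the ordering matters, here is a minimal sketch in plain Java. It is not the patch and does not use Hadoop's classes: the class `HaLoginOrderSketch`, its methods, and the host names are hypothetical, and it assumes the principal contains a `_HOST` placeholder that is filled in from the unsuffixed RM address, which the HA setup step overwrites with the per-instance value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the RM startup sequence; not Hadoop's actual code.
class HaLoginOrderSketch {
    static final String RM_HA_ID = "yarn.resourcemanager.ha.id";
    static final String RM_ADDRESS = "yarn.resourcemanager.address";
    static final String RM_PRINCIPAL = "yarn.resourcemanager.principal";

    // Assumed HA setup step: copy the per-instance address
    // (e.g. "yarn.resourcemanager.address.rm2") onto the unsuffixed key.
    static void setupHaConfiguration(Map<String, String> conf) {
        String perInstance = conf.get(RM_ADDRESS + "." + conf.get(RM_HA_ID));
        if (perInstance != null) {
            conf.put(RM_ADDRESS, perInstance);
        }
    }

    // Assumed login step: replace _HOST with the host part of the unsuffixed address.
    static String resolveLoginPrincipal(Map<String, String> conf) {
        String address = conf.get(RM_ADDRESS);
        int colon = address.indexOf(':');
        String host = colon < 0 ? address : address.substring(0, colon);
        return conf.get(RM_PRINCIPAL).replace("_HOST", host);
    }

    // Made-up configuration as seen on RM2; the unsuffixed address still names RM1.
    static Map<String, String> sampleConf() {
        Map<String, String> conf = new HashMap<>();
        conf.put(RM_HA_ID, "rm2");
        conf.put(RM_ADDRESS, "rm1.example.com:8032");
        conf.put(RM_ADDRESS + ".rm1", "rm1.example.com:8032");
        conf.put(RM_ADDRESS + ".rm2", "rm2.example.com:8032");
        conf.put(RM_PRINCIPAL, "rm/_HOST@EXAMPLE.COM");
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> conf = sampleConf();
        // Login before HA setup: RM2 resolves RM1's principal.
        System.out.println("before HA setup: " + resolveLoginPrincipal(conf));
        setupHaConfiguration(conf);
        // Login after HA setup: RM2 resolves its own principal.
        System.out.println("after HA setup:  " + resolveLoginPrincipal(conf));
    }
}
```

Under these assumptions, resolving the principal first yields `rm/rm1.example.com@EXAMPLE.COM` on RM2, while running the HA setup first yields `rm/rm2.example.com@EXAMPLE.COM`.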

Without the patch, one of the RMs always fails to start.
With the patch, both RMs start and one of them goes to standby state.
I stopped and started the RMs; the active/standby transitions are as expected.
I submitted an MR job to the cluster and it completed successfully.

Please kindly review.


> RM2 in HA setup tries to login using the RM1's kerberos principal
> -----------------------------------------------------------------
>                 Key: YARN-2805
>                 URL: https://issues.apache.org/jira/browse/YARN-2805
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Arpit Gupta
>            Assignee: Wangda Tan
>            Priority: Blocker
>         Attachments: YARN-2805.1.patch
> {code}
> 2014-11-04 08:41:08,705 INFO  resourcemanager.ResourceManager (SignalLogger.java:register(91)) - registered UNIX signal handlers for [TERM, HUP, INT]
> 2014-11-04 08:41:10,636 INFO  service.AbstractService (AbstractService.java:noteFailure(272)) - Service ResourceManager failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to login
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Failed to login
> 	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:211)
> 	at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> 	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1229)
> Caused by: java.io.IOException: Login failure for rm/IP@EXAMPLE.COM from keytab /etc/security/keytabs/rm.service.keytab: javax.security.auth.login.LoginException: Unable to obtain password from user
> 	at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:935)
> {code}
