hadoop-yarn-issues mailing list archives

From "genericqa (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7692) Resource Manager goes down when a user not included in a priority acl submits a job
Date Wed, 03 Jan 2018 10:38:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309449#comment-16309449 ]

genericqa commented on YARN-7692:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 14s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  9m 28s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 64m 26s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}106m 16s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7692 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12904001/YARN-7692.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux b692dfe701cf 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / c0c7cce |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
|  Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/19082/testReport/ |
| Max. process+thread count | 861 (vs. ulimit of 5000) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/19082/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Resource Manager goes down when a user not included in a priority acl submits a job
> -----------------------------------------------------------------------------------
>
>                 Key: YARN-7692
>                 URL: https://issues.apache.org/jira/browse/YARN-7692
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.9.0, 2.8.3, 3.0.0
>            Reporter: Charan Hebri
>            Assignee: Sunil G
>            Priority: Blocker
>         Attachments: YARN-7692.001.patch
>
>
> Test scenario
> ------------------
> 1. Create a cluster with no ACLs configured.
> 2. Submit jobs as an existing user, say 'user_a'.
> 3. Enable ACLs and create a priority ACL entry via the property yarn.scheduler.capacity.priority-acls. Do not include 'user_a' in this ACL.
> 4. Submit a job as 'user_a'.
> The job is rejected because 'user_a' does not have permission to run it, which is the expected behavior. However, the Resource Manager also goes down: when it tries to recover the previously submitted applications, the recovery fails.
> The exception seen is below:
> {noformat}
> 2017-12-27 10:52:30,064 INFO  conf.Configuration (Configuration.java:getConfResourceAsInputStream(2659)) - found resource yarn-site.xml at file:/etc/hadoop/3.0.0.0-636/0/yarn-site.xml
> 2017-12-27 10:52:30,065 INFO  scheduler.AbstractYarnScheduler (AbstractYarnScheduler.java:setClusterMaxPriority(911)) - Updated the cluste max priority to maxClusterLevelAppPriority = 10
> 2017-12-27 10:52:30,066 INFO  resourcemanager.ResourceManager (ResourceManager.java:transitionToActive(1177)) - Transitioning to active state
> 2017-12-27 10:52:30,097 INFO  resourcemanager.ResourceManager (ResourceManager.java:serviceStart(765)) - Recovery started
> 2017-12-27 10:52:30,102 INFO  recovery.RMStateStore (RMStateStore.java:checkVersion(747)) - Loaded RM state version info 1.5
> 2017-12-27 10:52:30,375 INFO  security.RMDelegationTokenSecretManager (RMDelegationTokenSecretManager.java:recover(196)) - recovering RMDelegationTokenSecretManager.
> 2017-12-27 10:52:30,380 INFO  resourcemanager.RMAppManager (RMAppManager.java:recover(561)) - Recovering 51 applications
> 2017-12-27 10:52:30,432 INFO  resourcemanager.RMAppManager (RMAppManager.java:recover(571)) - Successfully recovered 0 out of 51 applications
> 2017-12-27 10:52:30,432 ERROR resourcemanager.ResourceManager (ResourceManager.java:serviceStart(776)) - Failed to load/recover state
> org.apache.hadoop.yarn.exceptions.YarnException: org.apache.hadoop.security.AccessControlException: User hrt_qa (auth:SIMPLE) does not have permission to submit/update application_1514268754125_0001 for 0
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.checkAndGetApplicationPriority(CapacityScheduler.java:2348)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:396)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:358)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:567)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1390)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:771)
>         at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1143)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1183)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1179)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1179)
>         at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320)
>         at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
>         at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
>         at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:473)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:611)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)
> Caused by: org.apache.hadoop.security.AccessControlException: User hrt_qa (auth:SIMPLE) does not have permission to submit/update application_1514268754125_0001 for 0
>         ... 20 more
> 2017-12-27 10:52:30,434 INFO  service.AbstractService (AbstractService.java:noteFailure(273)) - Service RMActiveServices failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnException: org.apache.hadoop.security.AccessControlException: User hrt_qa (auth:SIMPLE) does not have permission to submit/update application_1514268754125_0001 for 0
> org.apache.hadoop.yarn.exceptions.YarnException: org.apache.hadoop.security.AccessControlException: User hrt_qa (auth:SIMPLE) does not have permission to submit/update application_1514268754125_0001 for 0
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.checkAndGetApplicationPriority(CapacityScheduler.java:2348)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:396)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recoverApplication(RMAppManager.java:358)
>         at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.recover(RMAppManager.java:567)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.recover(ResourceManager.java:1390)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:771)
>         at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1143)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1183)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1179)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1179)
>         at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320)
>         at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
>         at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
>         at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:473)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:611)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:510)
> Caused by: org.apache.hadoop.security.AccessControlException: User hrt_qa (auth:SIMPLE) does not have permission to submit/update application_1514268754125_0001 for 0
>         ... 20 more
> 2017-12-27 10:52:30,435 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(210)) - Stopping ResourceManager metrics system...
> 2017-12-27 10:52:30,435 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(216)) - ResourceManager metrics system stopped.
> 2017-12-27 10:52:30,436 INFO  impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(607)) - ResourceManager metrics system shutdown complete.
> 2017-12-27 10:52:30,436 INFO  event.AsyncDispatcher (AsyncDispatcher.java:serviceStop(155)) - AsyncDispatcher is draining to stop, ignoring any new events.
> 2017-12-27 10:52:30,437 INFO  event.AsyncDispatcher (AsyncDispatcher.java:register(223)) - Registering class org.apache.hadoop.yarn.server.resourcemanager.RMFatalEventType for class org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMFatalEventDispatcher
> 2017-12-27 10:52:30,438 INFO  security.NMTokenSecretManagerInRM (NMTokenSecretManagerInRM.java:<init>(75)) - NMTokenKeyRollingInterval: 86400000ms and NMTokenKeyActivationDelay: 900000ms
> 2017-12-27 10:52:30,438 INFO  security.RMContainerTokenSecretManager (RMContainerTokenSecretManager.java:<init>(79)) - ContainerTokenKeyRollingInterval: 86400000ms and ContainerTokenKeyActivationDelay: 900000ms
> 2017-12-27 10:52:30,438 INFO  security.AMRMTokenSecretManager (AMRMTokenSecretManager.java:<init>(94)) - AMRMTokenKeyRollingInterval: 86400000ms and AMRMTokenKeyActivationDelay: 900000 ms
> 2017-12-27 10:52:30,439 INFO  recovery.RMStateStoreFactory (RMStateStoreFactory.java:getStore(33)) - Using RMStateStore implementation - class org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore
> 2017-12-27 10:52:30,439 INFO  event.AsyncDispatcher (AsyncDispatcher.java:register(223)) - Registering class org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStoreEventType for class org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore$ForwardingEventHandler
> 2017-12-27 10:52:30,439 WARN  curator.CuratorZookeeperClient (CuratorZookeeperClient.java:<init>(96)) - session timeout [10000] is less than connection timeout [15000]
> 2017-12-27 10:52:30,440 INFO  imps.CuratorFrameworkImpl (CuratorFrameworkImpl.java:start(235)) - Starting
> {noformat}
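
The failure pattern in the trace above (an ACL exception thrown while recovering one application aborts the whole recovery, so 0 of 51 applications come back) can be illustrated with a small standalone sketch. Everything in it is hypothetical: the class `RecoverySketch`, its methods, and the `AclException` type are simplified stand-ins, not the real `RMAppManager` or `CapacityScheduler` code, and the "tolerant" variant is only one possible mitigation, not necessarily what YARN-7692.001.patch does.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of application recovery; not Hadoop code.
public class RecoverySketch {

    // Stand-in for the AccessControlException seen in the log.
    static class AclException extends Exception {
        AclException(String message) {
            super(message);
        }
    }

    // Stand-in for a priority ACL check: rejects users missing from the ACL.
    static void checkPriorityAcl(String user, List<String> allowedUsers) throws AclException {
        if (!allowedUsers.contains(user)) {
            throw new AclException("User " + user
                + " does not have permission to submit/update application");
        }
    }

    // Fail-fast recovery: the first rejected application aborts the whole loop,
    // which mirrors the behavior reported in the issue.
    static int recoverFailFast(List<String> appOwners, List<String> allowedUsers)
            throws AclException {
        int recovered = 0;
        for (String owner : appOwners) {
            checkPriorityAcl(owner, allowedUsers);
            recovered++;
        }
        return recovered;
    }

    // Tolerant recovery (an assumed mitigation): an ACL failure is logged and
    // that application is skipped, so the remaining applications still recover.
    static int recoverTolerant(List<String> appOwners, List<String> allowedUsers) {
        int recovered = 0;
        for (String owner : appOwners) {
            try {
                checkPriorityAcl(owner, allowedUsers);
                recovered++;
            } catch (AclException e) {
                System.out.println("Skipping application owned by " + owner
                    + ": " + e.getMessage());
            }
        }
        return recovered;
    }

    public static void main(String[] args) {
        List<String> owners = Arrays.asList("user_a", "user_a", "user_b");
        List<String> allowed = Arrays.asList("user_b");
        try {
            recoverFailFast(owners, allowed);
        } catch (AclException e) {
            System.out.println("Fail-fast recovery aborted: " + e.getMessage());
        }
        System.out.println("Tolerant recovery restored "
            + recoverTolerant(owners, allowed) + " of " + owners.size() + " applications");
    }
}
```

In this model the fail-fast loop throws on the first application owned by 'user_a', while the tolerant loop still restores the application owned by the permitted user. Whether rejected applications should be skipped, failed individually, or recovered without re-checking the ACL is a design decision for the actual fix.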



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

