hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Varun Saxena (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3779) Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster
Date Sat, 20 Jun 2015 11:51:02 GMT

    [ https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594551#comment-14594551
] 

Varun Saxena commented on YARN-3779:
------------------------------------

Added a patch and submitted it, fixing both cases. This JIRA should move to MAPREDUCE. But
not moving it because not sure if Jenkins will be able to post results of the submitted patch
then

> Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure
cluster
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-3779
>                 URL: https://issues.apache.org/jira/browse/YARN-3779
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.7.0
>         Environment: mrV2, secure mode
>            Reporter: Zhang Wei
>            Assignee: Varun Saxena
>            Priority: Critical
>         Attachments: YARN-3779.01.patch, YARN-3779.02.patch, YARN-3779.03.patch, log_aggr_deletion_on_refresh_error.log,
log_aggr_deletion_on_refresh_fix.log
>
>
> {{GSSException}} is thrown everytime log aggregation deletion is attempted after executing
bin/mapred hsadmin -refreshLogRetentionSettings in a secure cluster.
> The problem can be reproduced by following steps:
> 1. startup historyserver in secure cluster.
> 2. Log deletion happens as per expectation. 
> 3. execute {{mapred hsadmin -refreshLogRetentionSettings}} command to refresh the configuration
value.
> 4. All the subsequent attempts of log deletion fail with {{GSSException}}
> Following exception can be found in historyserver's log if log deletion is enabled. 
> {noformat}
> 2015-06-04 14:14:40,070 | ERROR | Timer-3 | Error reading root log dir this deletion
attempt is being aborted | AggregatedLogDeletionService.java:127
> java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException:
GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level:
Failed to find any Kerberos tgt)]; Host Details : local host is: "vm-31/9.91.12.31"; destination
host is: "vm-33":25000; 
>         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1414)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1363)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy9.getListing(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:519)
>         at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy10.getListing(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1767)
>         at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1750)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:691)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:753)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:749)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:749)
>         at org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.run(AggregatedLogDeletionService.java:68)
>         at java.util.TimerThread.mainLoop(Timer.java:555)
>         at java.util.TimerThread.run(Timer.java:505)
> Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed
[Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any
Kerberos tgt)]
>         at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:677)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641)
>         at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:640)
>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:724)
>         at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1462)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1381)
>         ... 21 more
> Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException:
No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>         at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
>         at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:411)
>         at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:550)
>         at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:367)
>         at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:716)
>         at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:712)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641)
>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:711)
>         ... 24 more
> Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find
any Kerberos tgt)
>         at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
>         at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
>         at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
>         at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
>         at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
>         at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
>         at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
>         ... 33 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message