ignite-issues mailing list archives

From "Puviarasu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (IGNITE-8890) Ignite YARN Kerberos - Delegation Token renewal
Date Wed, 27 Jun 2018 13:25:00 GMT

     [ https://issues.apache.org/jira/browse/IGNITE-8890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Puviarasu updated IGNITE-8890:
------------------------------
    Description: 
Because Ignite-YARN is a long-running application in a YARN environment, it should have a mechanism
to renew the delegation token.

In Ignite-YARN, when the ApplicationMaster is started, it acquires delegation tokens and stores
them in a ByteBuffer [Class: ApplicationMaster, Method: init()].

This ByteBuffer with the token information is passed to every container received from the ResourceManager
[Class: ApplicationMaster, Method: onContainersAllocated()].
Everything works until the delegation token's lifetime ends.
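
To illustrate the kind of mechanism being asked for, here is a minimal sketch that replaces the one-time cached buffer with a holder that re-fetches the serialized tokens on a timer. It is not the Ignite-YARN code: RefreshableTokens and its fetcher are hypothetical names, and how fresh tokens are actually obtained is left to the supplier. In a real ApplicationMaster that supplier would presumably need a Kerberos login (for example from a keytab) and could use Hadoop calls such as FileSystem.addDelegationTokens and Credentials.writeTokenStorageToStream; that part is an assumption and is not shown.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical holder that re-fetches serialized tokens instead of caching them once. */
final class RefreshableTokens implements AutoCloseable {
    /** Assumed to return a freshly serialized token buffer on every call. */
    private final Supplier<ByteBuffer> fetcher;
    private final AtomicReference<ByteBuffer> current = new AtomicReference<>();
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "token-refresh");
            t.setDaemon(true); // do not keep the JVM alive just for refreshing
            return t;
        });

    RefreshableTokens(Supplier<ByteBuffer> fetcher) {
        this.fetcher = fetcher;
        refresh(); // initial fetch, comparable to the one-time fetch done at startup
    }

    /** Fetches a new token buffer and publishes it atomically. */
    void refresh() {
        current.set(fetcher.get());
    }

    /** Schedules periodic refreshes; the period should be well below the token lifetime. */
    void start(long period, TimeUnit unit) {
        timer.scheduleAtFixedRate(() -> {
            try {
                refresh();
            } catch (RuntimeException e) {
                // Keep the previous buffer and retry on the next tick. An uncaught
                // exception here would cancel all further scheduled runs.
            }
        }, period, period, unit);
    }

    /** Returns an independent view so each container launch gets its own position and limit. */
    ByteBuffer tokens() {
        return current.get().duplicate();
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

With something like this, the container-allocation callback would call tokens() each time it builds a launch context instead of reusing the buffer captured at startup.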

Once the delegation token expires, the ApplicationMaster can no longer start Ignite inside
the containers it receives, and the following exception occurs:

*WARNING: Error launching container* 
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
 at org.apache.hadoop.ipc.Client.call(Client.java:1504)
 at org.apache.hadoop.ipc.Client.call(Client.java:1441)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
 at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
 at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
 at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
 at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
 at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2123)
 at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1253)
 at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1249)
 at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
 at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1249)
 at org.apache.ignite.yarn.utils.IgniteYarnUtils.setupFile(IgniteYarnUtils.java:65)
 at org.apache.ignite.yarn.ApplicationMaster.onContainersAllocated(ApplicationMaster.java:131)
 at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:292)

The ApplicationMaster keeps requesting more containers [Class: ApplicationMaster,
Method: run()] but cannot start Ignite inside any of them because of the expired/missing
delegation token. The failed containers are not released when the exception occurs.


 *This repeats until all the resources in the cluster are allocated to Ignition. As a result,
Ignition uses all resources in the cluster and no other jobs are able to run.*
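
Independently of token renewal, the resource exhaustion described above could be limited by releasing a container whenever its launch fails and by pausing new requests after repeated failures. The sketch below shows that idea only. ContainerOps and LaunchGuard are hypothetical stand-ins, not Hadoop or Ignite types, and containers are identified by plain strings. In a real ApplicationMaster the release step would presumably map to AMRMClientAsync.releaseAssignedContainer(ContainerId).

```java
/** Hypothetical stand-in for the YARN client calls; not a real Hadoop type. */
interface ContainerOps {
    void launch(String containerId) throws Exception;

    void release(String containerId);
}

/** Releases containers whose launch failed and tracks consecutive failures. */
final class LaunchGuard {
    private final ContainerOps ops;
    private final int maxConsecutiveFailures;
    private int consecutiveFailures;

    LaunchGuard(ContainerOps ops, int maxConsecutiveFailures) {
        this.ops = ops;
        this.maxConsecutiveFailures = maxConsecutiveFailures;
    }

    /** Tries to launch; on failure hands the container back. Returns true if launched. */
    synchronized boolean tryLaunch(String containerId) {
        try {
            ops.launch(containerId);
            consecutiveFailures = 0; // a success resets the failure streak
            return true;
        } catch (Exception e) {
            ops.release(containerId); // do not hold on to a container that cannot be used
            consecutiveFailures++;
            return false;
        }
    }

    /** The request loop can consult this before asking for more containers. */
    synchronized boolean shouldRequestMore() {
        return consecutiveFailures < maxConsecutiveFailures;
    }
}
```

The threshold is an arbitrary illustration; a real fix might instead back off and retry after the tokens have been refreshed.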

Kindly help in resolving the issue.

Thanks in Advance!!!

 

  was:
Because Ignite-YARN is a long-running application in a YARN environment, it should have a mechanism
to renew the delegation token.

In Ignite-YARN, when the ApplicationMaster is started, it acquires delegation tokens and stores
them in a ByteBuffer [Class: ApplicationMaster, Method: init()].
This ByteBuffer with the token information is passed to every container received from the ResourceManager
[Class: ApplicationMaster, Method: onContainersAllocated()].
Everything works until the delegation token's lifetime ends.
Once the delegation token expires, the ApplicationMaster can no longer start Ignite inside
the containers it receives, and the following exception occurs:

WARNING: Error launching container 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):

at org.apache.hadoop.ipc.Client.call(Client.java:1504)
at org.apache.hadoop.ipc.Client.call(Client.java:1441)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2123)
at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1253)
at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1249)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1249)
at org.apache.ignite.yarn.utils.IgniteYarnUtils.setupFile(IgniteYarnUtils.java:65)
at org.apache.ignite.yarn.ApplicationMaster.onContainersAllocated(ApplicationMaster.java:131)
at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:292)

The ApplicationMaster keeps requesting more containers [Class: ApplicationMaster,
Method: run()] but cannot start Ignite inside any of them because of the expired/missing
delegation token.
This repeats until all the resources in the cluster are allocated to Ignition.

Kindly help in resolving the issue.

Thanks in Advance!!!

 


> Ignite YARN Kerberos - Delegation Token renewal
> -----------------------------------------------
>
>                 Key: IGNITE-8890
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8890
>             Project: Ignite
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 2.3
>         Environment: Kerberos cluster
> Ignite Version : 2.3.0
> Module : Ignite-YARN
> Class : ApplicationMaster
>  
>            Reporter: Puviarasu
>            Priority: Blocker
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
