hadoop-yarn-issues mailing list archives

From "Wilfred Spiegelenburg (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8865) RMStateStore contains large number of expired RMDelegationToken
Date Thu, 11 Oct 2018 15:32:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16646622#comment-16646622 ]

Wilfred Spiegelenburg commented on YARN-8865:

I am not sure what happened in the environment, or even whether the two cleanup intervals were
set differently (key and token each have their own interval). I only have the ZooKeeper DB to
work with, and no logs from that time frame.

The ADTSM (AbstractDelegationTokenSecretManager) method {{addPersistedDelegationToken}} already
has a safeguard: the secret manager cannot be running at the time we restore. That removes a lot
of the problem. On the other side, HDFS (specifically the NN) does not use this method: it has
its own implementation of {{addPersistedDelegationToken}} in DelegationTokenSecretManager
(defined in org.apache.hadoop.hdfs.security.token.delegation).
The HDFS side should thus not be affected by the change.
The other three users are the YARN RM, YARN ATS and MR JHS. Based on what I can see, none of
them has an issue.

If the change is still considered too risky, I think the best option is to still add the
expired tokens, but with a null password.
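
To make the two behaviours discussed above concrete, here is a minimal sketch of a restore path that refuses to add tokens once the secret manager is running and skips tokens that have already expired. It is an illustration only, not the code from the attached patches and not the real ADTSM: the class {{TokenRestoreSketch}}, the method {{addPersistedToken}} and the use of plain strings and longs for token identifiers and renew dates are all assumptions made for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch of a secret manager restore path.
 * Not the Hadoop AbstractDelegationTokenSecretManager API.
 */
class TokenRestoreSketch {
    // Token identifier -> renew date in milliseconds (assumed representation).
    private final Map<String, Long> currentTokens = new LinkedHashMap<>();
    private boolean running = false;

    /** Marks the manager as running; restore is rejected after this point. */
    void start() {
        running = true;
    }

    /**
     * Adds a token read back from the state store.
     * Returns true if the token was added, false if it was skipped as expired.
     */
    boolean addPersistedToken(String tokenId, long renewDate, long now) {
        // Safeguard: tokens may only be restored before the manager runs.
        if (running) {
            throw new IllegalStateException(
                "Cannot add persisted tokens to a running secret manager");
        }
        // An expired token cannot be renewed or used, so do not restore it.
        if (renewDate < now) {
            return false;
        }
        currentTokens.put(tokenId, renewDate);
        return true;
    }

    /** Number of tokens currently held in memory. */
    int size() {
        return currentTokens.size();
    }
}
```

In this sketch the caller can use the boolean result to decide whether the skipped token should also be removed from the state store. The null password alternative would instead keep the expired entry in the map without a usable password, which is not shown here.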

> RMStateStore contains large number of expired RMDelegationToken
> ---------------------------------------------------------------
>                 Key: YARN-8865
>                 URL: https://issues.apache.org/jira/browse/YARN-8865
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.1.0
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Wilfred Spiegelenburg
>            Priority: Major
>         Attachments: YARN-8865.001.patch, YARN-8865.002.patch
> When the RM state store is restored, expired delegation tokens are restored and added
> to the system. These expired tokens do not get cleaned up or removed. The exact reason why
> the tokens are still in the store is not clear. We have seen as many as 250,000 tokens in
> the store, some of which were 2 years old.
> This has two side effects:
> * for the ZooKeeper store this leads to a jute buffer exhaustion issue and prevents the
>   RM from becoming active.
> * restore takes longer than needed and heap usage is higher than it should be
> We should not restore already expired tokens since they cannot be renewed or used.
