hadoop-yarn-issues mailing list archives

From "Robert Kanter (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-7262) Add a hierarchy into the ZKRMStateStore for delegation token znodes to prevent jute buffer overflow
Date Thu, 28 Sep 2017 01:28:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter updated YARN-7262:
--------------------------------
    Attachment: YARN-7262.001.patch

The patch adds the ability to configure a hierarchy like that in YARN-2962.  I generalized
and reused code from YARN-2962 when possible; otherwise, I tried to mirror the YARN-2962 code.
 There are two big differences:
# The app znodes in YARN-2962 had children (for app attempts), which we don't have to worry
about here because delegation token znodes don't have children.
# YARN-2962 adds an extra level named "HIERARCHIES" that doesn't seem to be necessary.  The
token znode path is already quite long, so I omitted that.  The layout looks like this:
{noformat}
 * |--- RM_DT_SECRET_MANAGER_ROOT
 *        |----- RM_DT_SEQUENTIAL_NUMBER_ZNODE_NAME
 *        |----- RM_DELEGATION_TOKENS_ROOT_ZNODE_NAME
 *        |       |----- 1
 *        |       |      |----- (#TokenId barring last character)
 *        |       |      |       |----- (#Last character of TokenId)
 *        |       |      ....
 *        |       |----- 2
 *        |       |      |----- (#TokenId barring last 2 characters)
 *        |       |      |       |----- (#Last 2 characters of TokenId)
 *        |       |      ....
 *        |       |----- 3
 *        |       |      |----- (#TokenId barring last 3 characters)
 *        |       |      |       |----- (#Last 3 characters of TokenId)
 *        |       |      ....
 *        |       |----- 4
 *        |       |      |----- (#TokenId barring last 4 characters)
 *        |       |      |       |----- (#Last 4 characters of TokenId)
 *        |       |      ....
 *        |       |----- Token_1
 *        |       |----- Token_2
 *        |       ....
{noformat}
YARN-2962 had "HIERARCHIES" next to "Token_#" with "1", "2", "3", and "4" under it.  Here,
we just put "1", "2", "3", and "4" next to "Token_#".
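
As a rough illustration of the layout above (this is not code from the patch), the path for a token znode could be derived as follows. The class name {{TokenZnodeLayout}}, the method {{pathFor}}, and the {{/tokens}} root used in the usage note are hypothetical placeholders; the real constants live in {{ZKRMStateStore}}.

```java
/**
 * Hypothetical sketch of the znode layout described above.
 * Not taken from YARN-7262.001.patch; names are illustrative only.
 */
final class TokenZnodeLayout {
  private TokenZnodeLayout() {}

  /**
   * @param root       path standing in for RM_DELEGATION_TOKENS_ROOT_ZNODE_NAME
   * @param nodeName   token znode name, e.g. "RMDelegationToken_0005"
   * @param splitIndex 0 for the flat layout, 1..4 for the number of
   *                   trailing characters that become the leaf znode
   */
  static String pathFor(String root, String nodeName, int splitIndex) {
    if (splitIndex < 0 || splitIndex > 4) {
      throw new IllegalArgumentException("splitIndex must be 0..4: " + splitIndex);
    }
    if (splitIndex == 0) {
      // Flat layout: the token znode sits directly under the root.
      return root + "/" + nodeName;
    }
    if (nodeName.length() <= splitIndex) {
      throw new IllegalArgumentException("name too short to split: " + nodeName);
    }
    int cut = nodeName.length() - splitIndex;
    // root / <splitIndex> / <name barring last N chars> / <last N chars>
    return root + "/" + splitIndex + "/"
        + nodeName.substring(0, cut) + "/" + nodeName.substring(cut);
  }
}
```

With these assumptions, {{TokenZnodeLayout.pathFor("/tokens", "RMDelegationToken_0005", 1)}} gives {{/tokens/1/RMDelegationToken_000/5}}, so each parent znode holds at most ten children when splitting on one digit.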

Some more useful info about the patch:
- The default behavior is to use a flat layout, like before.  {{yarn.resourcemanager.zk-delegation-token-node.split-index}}
can be set to {{1}}, {{2}}, {{3}}, or {{4}} to split on the last 1, 2, 3, or 4 digits
of the token sequence number; {{0}} keeps the flat layout.
- Token sequence numbers start at {{0}} and have a variable width, unlike Application ID
sequence numbers, which are always at least 4 digits wide, so when naming the token znodes,
the code zero-pads the sequence number to at least 4 digits.
 For example, {{RMDelegationToken_5}} becomes {{RMDelegationToken_0005}}.  This ensures that
the index splitting works correctly.  The exception is the flat layout, where the names are
left unpadded so they stay the same as before.
- When looking for a delegation token znode, the code first tries the path for the current value of {{yarn.resourcemanager.zk-delegation-token-node.split-index}},
but it will fall back to looking at the other possible znode paths in case the token was created
when {{yarn.resourcemanager.zk-delegation-token-node.split-index}} had been set to a different
value.  This ensures we don't lose any tokens when {{yarn.resourcemanager.zk-delegation-token-node.split-index}}
changes.
- I haven't had a chance to try it out in an actual cluster yet, but there are unit tests
that show it working correctly.  In the meantime, we can still start reviews.
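
The padding and fallback lookup described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: the class {{TokenZnodeLookup}}, its method names, and the order in which the non-configured split indexes are tried are all hypothetical; only the {{RMDelegationToken_}} prefix and the pad-to-4 rule come from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of znode naming and fallback lookup; not from the patch. */
final class TokenZnodeLookup {
  // Prefix taken from the example name in the message.
  static final String PREFIX = "RMDelegationToken_";

  private TokenZnodeLookup() {}

  /** Flat layout keeps the old unpadded name; split layouts pad to >= 4 digits. */
  static String nodeName(int sequenceNumber, int splitIndex) {
    return splitIndex == 0
        ? PREFIX + sequenceNumber
        : PREFIX + String.format(Locale.ROOT, "%04d", sequenceNumber);
  }

  /** Path of the token znode for one particular split index (0..4). */
  static String pathFor(String root, int sequenceNumber, int splitIndex) {
    String name = nodeName(sequenceNumber, splitIndex);
    if (splitIndex == 0) {
      return root + "/" + name;
    }
    int cut = name.length() - splitIndex;
    return root + "/" + splitIndex + "/"
        + name.substring(0, cut) + "/" + name.substring(cut);
  }

  /**
   * Paths to probe, in order: the configured split index first, then every
   * other possible index (ascending order here is an assumption).
   */
  static List<String> candidatePaths(String root, int sequenceNumber,
                                     int configuredSplitIndex) {
    List<String> paths = new ArrayList<>();
    paths.add(pathFor(root, sequenceNumber, configuredSplitIndex));
    for (int i = 0; i <= 4; i++) {
      if (i != configuredSplitIndex) {
        paths.add(pathFor(root, sequenceNumber, i));
      }
    }
    return paths;
  }
}
```

A caller would check each candidate path in turn (for example with a ZooKeeper existence check) and stop at the first one that exists, which is what lets tokens written under an older split index still be found.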

> Add a hierarchy into the ZKRMStateStore for delegation token znodes to prevent jute buffer
overflow
> ---------------------------------------------------------------------------------------------------
>
>                 Key: YARN-7262
>                 URL: https://issues.apache.org/jira/browse/YARN-7262
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 2.6.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: YARN-7262.001.patch
>
>
> We've seen users who are running into a problem where the RM is storing so many delegation
tokens in the {{ZKRMStateStore}} that the _listing_ of those znodes is larger than the jute
buffer. This is fine during normal operation, but becomes a problem on a failover because the RM
will try to read in all of the token znodes (i.e. call {{getChildren}} on the parent znode).
 This is particularly bad because everything appears to be okay, but then if a failover occurs
you end up with no active RMs.
> There was a similar problem with the YARN application data that was fixed in YARN-2962
by adding a (configurable) hierarchy of znodes, off by default, so the RM could pull subchildren
without overflowing the jute buffer.
> We should add a hierarchy similar to that of YARN-2962, but for the delegation token
znodes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

