hadoop-hdfs-issues mailing list archives

From "He Xiaoqiao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes
Date Fri, 22 Feb 2019 09:21:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774934#comment-16774934 ]

He Xiaoqiao commented on HDFS-14305:

Hi [~csun], I think this issue was only introduced by HDFS-6440. Before that change, it worked
well in an HA cluster with 2 NameNodes (based on branch-2.7). I checked the {{serialNo}} ranges,
which are as follows, with no overlap between the 2 NameNodes:
{quote}nnIndex=0: [0, 2147483647]
 nnIndex=1: [-2147483648, -1]
{quote}
HDFS-6440 replaced {{nnIndex}} with {{intRange}} + {{nnRangeStart}} and distributes only positive
integers among the NameNodes. However, the initial serialNo can still be a negative integer because
it comes from {{new SecureRandom().nextInt()}}, which causes serialNo overlap between different
NameNodes in the same namespace. In short, the root cause is {{new SecureRandom().nextInt()}}.
 I propose to use only positive integers as the serialNo of BlockTokenSecretManager to avoid this
issue. FYI.
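
A minimal sketch of the point above, assuming the formula quoted in the issue description below. The class and method names ({{SerialNoRangeSketch}}, {{currentFormula}}, {{nonNegativeFormula}}) are hypothetical and are not the actual patch; clearing the sign bit is just one possible way to keep the serial number non-negative.

```java
import java.security.SecureRandom;

public class SerialNoRangeSketch {
    // Formula as quoted in the issue description (post HDFS-6440).
    static int currentFormula(int serialNo, int numNNs, int nnIndex) {
        int intRange = Integer.MAX_VALUE / numNNs;
        int nnRangeStart = intRange * nnIndex;
        // Java's % keeps the sign of the dividend, so a negative serialNo
        // produces a negative remainder here.
        return (serialNo % intRange) + nnRangeStart;
    }

    // Hypothetical variant: clear the sign bit first so the remainder is
    // always in [0, intRange).
    static int nonNegativeFormula(int serialNo, int numNNs, int nnIndex) {
        int intRange = Integer.MAX_VALUE / numNNs;
        int nnRangeStart = intRange * nnIndex;
        return ((serialNo & Integer.MAX_VALUE) % intRange) + nnRangeStart;
    }

    // True if value lies in the range reserved for nnIndex.
    static boolean inOwnRange(int value, int numNNs, int nnIndex) {
        int intRange = Integer.MAX_VALUE / numNNs;
        long start = (long) intRange * nnIndex;
        return value >= start && value < start + intRange;
    }

    public static void main(String[] args) {
        int numNNs = 2;
        int seed = new SecureRandom().nextInt(); // may be negative
        for (int nnIndex = 0; nnIndex < numNNs; nnIndex++) {
            int before = currentFormula(seed, numNNs, nnIndex);
            int after = nonNegativeFormula(seed, numNNs, nnIndex);
            System.out.println("nnIndex=" + nnIndex
                + " current=" + before + " inOwnRange=" + inOwnRange(before, numNNs, nnIndex)
                + " nonNegative=" + after + " inOwnRange=" + inOwnRange(after, numNNs, nnIndex));
        }
    }
}
```

With a negative seed, {{currentFormula}} for nnIndex=1 falls below that NameNode's {{nnRangeStart}}, while the non-negative variant stays inside its own range.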

> Serial number in BlockTokenSecretManager could overlap between different namenodes
> ----------------------------------------------------------------------------------
>                 Key: HDFS-14305
>                 URL: https://issues.apache.org/jira/browse/HDFS-14305
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: security
>            Reporter: Chao Sun
>            Assignee: Chao Sun
>            Priority: Major
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the initial
serial number, and then uses this formula to rotate it:
> {code:java}
>     this.intRange = Integer.MAX_VALUE / numNNs;
>     this.nnRangeStart = intRange * nnIndex;
>     this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
>  {code}
> where {{numNNs}} is the total number of NameNodes in the cluster, and {{nnIndex}} is
the index of the current NameNode specified in the configuration {{dfs.ha.namenodes.<nameservice>}}.
> However, with this approach, different NameNodes could have overlapping ranges for the serial
number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, and we have 2 NameNodes
{{nn1}} and {{nn2}} in the configuration. Then the ranges for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number could be a negative integer.
> Moreover, when the keys are updated, the serial number will again be updated with the same formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number could be updated to a range that belongs to a different
NameNode, thus increasing the chance of collision again.
> When a collision happens, DataNodes could overwrite an existing key, which will cause
clients to fail with an {{InvalidToken}} error.
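
The toy ranges in the quoted description can be reproduced by brute force. This is only an illustrative sketch: {{ToyRangeSketch}} and {{range}} are hypothetical names, and {{maxValue}} stands in for {{Integer.MAX_VALUE}} (100 in the example above).

```java
public class ToyRangeSketch {
    // Returns {min, max} of the serial numbers reachable by nnIndex when
    // the seed may be any integer in [-maxValue - 1, maxValue].
    static int[] range(int maxValue, int numNNs, int nnIndex) {
        int intRange = maxValue / numNNs;
        int nnRangeStart = intRange * nnIndex;
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int seed = -maxValue - 1; seed <= maxValue; seed++) {
            // Same formula as in the description; % keeps the seed's sign.
            int serialNo = (seed % intRange) + nnRangeStart;
            min = Math.min(min, serialNo);
            max = Math.max(max, serialNo);
        }
        return new int[] {min, max};
    }

    public static void main(String[] args) {
        for (int nnIndex = 0; nnIndex < 2; nnIndex++) {
            int[] r = range(100, 2, nnIndex);
            System.out.println("nn" + (nnIndex + 1) + " -> [" + r[0] + ", " + r[1] + "]");
        }
    }
}
```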

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org
