hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9379) Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes
Date Tue, 17 Nov 2015 21:48:11 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15009604#comment-15009604 ]

Konstantin Shvachko commented on HDFS-9379:

Yes, the test for NNThroughputBenchmark uses default values for all operations, so changing
the default number of threads for {{BlockReportStats}} will get this case tested.
For actual benchmarking the defaults are rarely used. If you do use them, all the
parameters are clearly shown in the output and you can adjust whatever is needed.

> Make NNThroughputBenchmark$BlockReportStats support more than 10 datanodes
> --------------------------------------------------------------------------
>                 Key: HDFS-9379
>                 URL: https://issues.apache.org/jira/browse/HDFS-9379
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: test
>            Reporter: Mingliang Liu
>            Assignee: Mingliang Liu
>             Fix For: 2.8.0
>         Attachments: HDFS-9379.000.patch
> Currently, the {{NNThroughputBenchmark}} test {{BlockReportStats}} relies on the {{datanodes}}
array being sorted in lexicographical order of each datanode's {{xferAddr}}.
> * An assertion checks the lexicographical order of the datanodes' {{xferAddr}} when filling
the {{datanodes}} array, see [the code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1152].
> * When looking up a datanode by {{DatanodeInfo}}, it uses binary search against the
{{datanodes}} array, see [the code|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NNThroughputBenchmark.java#L1187].
> In {{DatanodeID}}, the {{xferAddr}} is defined as {{host:port}}. In {{NNThroughputBenchmark}},
the port is simply _the index of the tiny datanode_ plus one.
> The problem here is that, when there are more than 9 tiny datanodes ({{numThreads}}),
the datanodes' {{xferAddr}} values are no longer in lexicographical order, because the string
form of the port (index plus one) does not sort the same way as its numeric value. For example,
> {code}
> ...
> ...
> {code}
> {{}} is greater than {{}}. The assertion will fail and
the binary search won't work.
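The ordering problem described above can be reproduced in isolation. The following is a minimal sketch, not code from {{NNThroughputBenchmark}}: the class name, the helper methods, and the host {{127.0.0.1}} are all hypothetical, and only the "port = index + 1" rule is taken from the issue text.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of NNThroughputBenchmark.
class XferAddrOrderDemo {
  // Builds "host:port" strings where port = index + 1, as the issue describes.
  static String[] buildXferAddrs(String host, int count) {
    String[] addrs = new String[count];
    for (int i = 0; i < count; i++) {
      addrs[i] = host + ":" + (i + 1);
    }
    return addrs;
  }

  // Returns true if every element is lexicographically <= its successor.
  static boolean isLexicographicallySorted(String[] addrs) {
    for (int i = 1; i < addrs.length; i++) {
      if (addrs[i - 1].compareTo(addrs[i]) > 0) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    String host = "127.0.0.1"; // placeholder host, assumed for illustration

    // With 9 datanodes the ports are single digits, so string order holds.
    System.out.println(isLexicographicallySorted(buildXferAddrs(host, 9)));

    // With 10 datanodes, ":9" compares greater than ":10", so it breaks.
    System.out.println(isLexicographicallySorted(buildXferAddrs(host, 10)));

    // Binary search assumes a sorted array; here it misses an existing entry
    // and returns a negative value.
    String[] addrs = buildXferAddrs(host, 12);
    System.out.println(Arrays.binarySearch(addrs, host + ":10"));
  }
}
```

The comparison fails at the first character after the colon: {{'9'}} sorts after {{'1'}}, so any address ending in {{:9}} compares greater than one ending in {{:10}}.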
> The simple fix is to calculate the datanode index directly from the port, instead of using
binary search.
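A rough sketch of what "index from port" could look like, assuming only what the description states: {{xferAddr}} has the form {{host:port}} and the port is the datanode index plus one. The class and method names are hypothetical, and the attached patch may obtain the port differently (for example from the datanode object rather than by parsing a string); this is not a reproduction of it.

```java
// Hypothetical helper; not the code from HDFS-9379.000.patch.
class PortIndexLookup {
  // Recovers the array index from an xferAddr of the form "host:port",
  // assuming port = index + 1 as described in the issue.
  static int indexFromXferAddr(String xferAddr) {
    int colon = xferAddr.lastIndexOf(':');
    if (colon < 0) {
      throw new IllegalArgumentException("Expected host:port, got " + xferAddr);
    }
    int port = Integer.parseInt(xferAddr.substring(colon + 1));
    return port - 1;
  }

  public static void main(String[] args) {
    // Placeholder host, assumed for illustration only.
    System.out.println(indexFromXferAddr("127.0.0.1:1"));  // index 0
    System.out.println(indexFromXferAddr("127.0.0.1:10")); // index 9
  }
}
```

Because the lookup is plain arithmetic on the port, it does not depend on how the {{datanodes}} array is ordered, so it keeps working past 9 datanodes.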
