hadoop-hdfs-dev mailing list archives

From "Chen Liang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-11535) Performance analysis of new DFSNetworkTopology#chooseRandom
Date Wed, 15 Mar 2017 19:54:41 GMT
Chen Liang created HDFS-11535:
---------------------------------

             Summary: Performance analysis of new DFSNetworkTopology#chooseRandom
                 Key: HDFS-11535
                 URL: https://issues.apache.org/jira/browse/HDFS-11535
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: namenode
            Reporter: Chen Liang
            Assignee: Chen Liang
         Attachments: PerfTest.pdf

This JIRA is created to post the results of some performance experiments we ran. For those
who are interested, please see the attached .pdf file for more detail. The attached patch file
includes the experiment code we ran.

The key insight from these tests is that, although *the new method outperforms the
current one in most cases*, there is still *one case where the current one is better*: when
the cluster has only one storage type and we always search for that storage type. In this
case, storage-type-based pruning is simply a waste of time; blindly picking a random node
(the current method) suffices.

Therefore, based on the analysis, we propose to use a *combination of both the old and the
new methods*:

Say we search for a node of type X. Since inner nodes now all keep storage type info, we can
*just check the root node to see if X is the only type it has*. If yes, blindly picking a random
leaf will work, so we simply call the old method; otherwise we call the new method.
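
A minimal Java sketch of this dispatch is given here for illustration only. It is not the
DFSNetworkTopology code: the Node class, the StorageType enum, and the chooseBlind,
choosePruned and chooseRandom methods are all hypothetical names, and the tree is assumed
to be built bottom-up so that every inner node holds per-type leaf counts for its subtree.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Illustrative sketch only; all names here are hypothetical, not Hadoop APIs.
class HybridChooseRandomSketch {
    enum StorageType { DISK, SSD, ARCHIVE }

    /** Simplified topology node. A leaf has no children and one storage type. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        // Number of leaves of each storage type in this subtree.
        final Map<StorageType, Integer> typeCounts = new EnumMap<>(StorageType.class);
        int leafCount;

        Node(String name) { this.name = name; }

        static Node leaf(String name, StorageType type) {
            Node n = new Node(name);
            n.typeCounts.put(type, 1);
            n.leafCount = 1;
            return n;
        }

        /** Adds a fully built child (bottom-up construction) and merges its counts. */
        Node add(Node child) {
            children.add(child);
            leafCount += child.leafCount;
            child.typeCounts.forEach((t, c) -> typeCounts.merge(t, c, Integer::sum));
            return this;
        }

        boolean isLeaf() { return children.isEmpty(); }
        int count(StorageType t) { return typeCounts.getOrDefault(t, 0); }
    }

    /** "Old" style: pick a uniformly random leaf, ignoring storage types. */
    static Node chooseBlind(Node root, Random rnd) {
        Node cur = root;
        while (!cur.isLeaf()) {
            int k = rnd.nextInt(cur.leafCount);
            for (Node c : cur.children) {
                if (k < c.leafCount) { cur = c; break; }
                k -= c.leafCount;
            }
        }
        return cur;
    }

    /** "New" style: descend only into subtrees that contain the wanted type. */
    static Node choosePruned(Node root, StorageType type, Random rnd) {
        Node cur = root;
        while (!cur.isLeaf()) {
            int k = rnd.nextInt(cur.count(type));
            for (Node c : cur.children) {
                int n = c.count(type);
                if (k < n) { cur = c; break; }
                k -= n;
            }
        }
        return cur;
    }

    /** Hybrid: one check at the root decides which method to call. */
    static Node chooseRandom(Node root, StorageType type, Random rnd) {
        int matching = root.count(type);
        if (matching == 0) {
            return null; // no leaf of this type anywhere in the tree
        }
        // If every leaf has the wanted type, pruning cannot help.
        boolean onlyType = matching == root.leafCount;
        return onlyType ? chooseBlind(root, rnd) : choosePruned(root, type, rnd);
    }

    public static void main(String[] args) {
        Node root = new Node("root")
            .add(new Node("rack1")
                .add(Node.leaf("d1", StorageType.DISK))
                .add(Node.leaf("s1", StorageType.SSD)))
            .add(new Node("rack2")
                .add(Node.leaf("d2", StorageType.DISK)));
        Node picked = chooseRandom(root, StorageType.SSD, new Random());
        System.out.println(picked == null ? "none" : picked.name);
    }
}
```

In this sketch the extra cost of the hybrid is a single count lookup at the root, so the
single-storage-type case pays almost nothing before falling back to the blind pick.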

There is still at least one missing piece in this performance test, which is garbage collection.
The new method creates a few more objects during the search, which adds GC overhead.
I'm still thinking about potential optimizations, but this seems tricky, and I'm not
sure whether the optimization is worth doing at all. Please feel free to leave any comments/suggestions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


