hadoop-hdfs-issues mailing list archives

From "Ming Ma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10203) Excessive topology lookup for large number of client machines in namenode
Date Thu, 24 Mar 2016 06:15:25 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15209837#comment-15209837 ]

Ming Ma commented on HDFS-10203:

A couple of ideas we talked about:

* Use {{TableMapping}}, or a better version of {{ScriptBasedMapping}} that can skip launching
the script command for non-datanode client machines.
* Add a new configuration in NN to control whether DatanodeManager should skip topology resolution
for non-datanode client machines, regardless of which {{DNSToSwitchMapping}} class is used.
That won't work if YARN node managers aren't on the same machines as datanodes.
* Have DFSClient compute its own machine's network location and pass it to NN via a new
{{ClientProtocol#getBlockLocations}} method, so that NN no longer needs to do topology resolution
for client machines.
* Have NN return an unsorted datanode list from {{ClientProtocol#getBlockLocations}} and have DFSClient
sort that list by network distance, given the work already done in https://issues.apache.org/jira/browse/HDFS-9579.
For compatibility, we might need a new method since this changes the semantics from sorted datanodes
to unsorted datanodes; that way, an old DFSClient talking to a new NN can still achieve the read-from-closer-datanode behavior.
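The first two ideas above amount to short-circuiting topology resolution for hosts that are not datanodes, so no script is ever forked for them. A minimal standalone sketch of that idea, in plain Java rather than against the real {{DNSToSwitchMapping}} API (the class name, the datanode-host set, and the script stub here are all illustrative assumptions):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch only: skip the expensive topology-script fork for hosts that are
// not datanodes, and map them to a default rack instead. None of these
// names are the actual Hadoop API.
class SkippingMapping {
    // Hosts known to be datanodes; in the NN this would come from the
    // DatanodeManager's registered-datanode map.
    private final Set<String> datanodeHosts;
    // Counts how many times the (expensive) script path was taken.
    int scriptForks = 0;

    SkippingMapping(Set<String> datanodeHosts) {
        this.datanodeHosts = datanodeHosts;
    }

    // Stand-in for script resolution; in Hadoop's ScriptBasedMapping this
    // forks a subprocess per uncached host.
    private String runTopologyScript(String host) {
        scriptForks++;
        return "/rack-" + Math.abs(host.hashCode() % 4);  // placeholder rack
    }

    List<String> resolve(List<String> hosts) {
        List<String> racks = new ArrayList<>(hosts.size());
        for (String host : hosts) {
            if (datanodeHosts.contains(host)) {
                racks.add(runTopologyScript(host));
            } else {
                // Non-datanode client: no fork, just a default rack.
                racks.add("/default-rack");
            }
        }
        return racks;
    }

    public static void main(String[] args) {
        SkippingMapping m =
            new SkippingMapping(new HashSet<>(Arrays.asList("dn1", "dn2")));
        System.out.println(m.resolve(Arrays.asList("dn1", "clientA", "clientB")));
        System.out.println("script forks: " + m.scriptForks);
    }
}
```

With the second idea (a DatanodeManager-level switch), the same membership check would sit in front of whatever mapping class is configured, which is why it breaks when YARN node managers run on non-datanode machines that still deserve real locations.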


> Excessive topology lookup for large number of client machines in namenode
> -------------------------------------------------------------------------
>                 Key: HDFS-10203
>                 URL: https://issues.apache.org/jira/browse/HDFS-10203
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Ming Ma
> In the {{ClientProtocol#getBlockLocations}} call, DatanodeManager computes the network
distance between the client machine and the datanodes. As part of that, it needs to resolve
the network location of the client machine. If the client machine isn't a datanode, it needs
to ask {{DNSToSwitchMapping}} to resolve it.
> {noformat}
>   public void sortLocatedBlocks(final String targethost,
>       final List<LocatedBlock> locatedblocks) {
>     // Sort the blocks.
>     // Node managers may run on machines that are not datanodes,
>     // so look up a generic Node here, not only a datanode.
>     Node client = getDatanodeByHost(targethost);
>     if (client == null) {
>       List<String> hosts = new ArrayList<>(1);
>       hosts.add(targethost);
>       List<String> resolvedHosts = dnsToSwitchMapping.resolve(hosts);
>       if (resolvedHosts != null && !resolvedHosts.isEmpty()) {
>         ....
>       }
>     }
>   }
> {noformat}
> When there are tens of thousands of non-datanode client machines hitting a namenode
that uses {{ScriptBasedMapping}}, this causes the following issues:
> * After namenode startup, {{CachedDNSToSwitchMapping}} only has datanodes in its cache.
Calls from many different client machines mean lots of process forks to run the topology script,
which can cause a spike in namenode load.
> * The cache in {{CachedDNSToSwitchMapping}} can grow large. Under a normal workload, say
< 100k client machines with each cache entry using < 200 bytes, it will take up to
20MB, not much compared to NN's heap size. But in theory it can still blow up NN if there
is misconfiguration or a malicious attack with millions of distinct IP addresses.
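The back-of-envelope sizing in the last bullet can be checked with a few lines of arithmetic; the 200-bytes-per-entry figure is the bullet's own assumed upper bound, not a measured number:

```java
// Back-of-envelope sizing for the CachedDNSToSwitchMapping cache,
// assuming (as the bullet above does) at most 200 bytes per entry.
class CacheSizing {
    static final long BYTES_PER_ENTRY = 200;  // assumed upper bound

    static long cacheBytes(long clientMachines) {
        return clientMachines * BYTES_PER_ENTRY;
    }

    public static void main(String[] args) {
        // ~100k client machines: about 20 MB, small next to the NN heap.
        System.out.println(cacheBytes(100_000) / 1_000_000 + " MB");
        // Millions of spoofed source IPs: around 1 GB, enough to hurt the NN.
        System.out.println(cacheBytes(5_000_000) / 1_000_000 + " MB");
    }
}
```

The second line is the pathological case: since the cache is keyed by client host, an attacker (or a misconfigured NAT-less network) that presents millions of distinct addresses grows it without bound.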

This message was sent by Atlassian JIRA
