hadoop-hdfs-dev mailing list archives

From "Suresh Bahuguna (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-11234) distcp performance is suboptimal for high bandwidth/high latency setups
Date Mon, 12 Dec 2016 05:59:58 GMT
Suresh Bahuguna created HDFS-11234:
--------------------------------------

             Summary: distcp performance is suboptimal for high bandwidth/high latency setups
                 Key: HDFS-11234
                 URL: https://issues.apache.org/jira/browse/HDFS-11234
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: hdfs
    Affects Versions: 2.7.1
            Reporter: Suresh Bahuguna


distcp uses TCP sockets with the buffer size explicitly set to 128K. On a link with very high
bandwidth but also very high latency, this makes throughput quite poor: with a fixed buffer, TCP
can keep at most 128K of unacknowledged data in flight, so the sender stops and waits for ACKs
instead of filling the link. By not setting the socket buffer size and letting the Linux kernel
manage (auto-tune) it, we should be able to get optimal performance.
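
To make the effect concrete, here is a rough back-of-the-envelope sketch in Java. It is not distcp code. It assumes the 128K buffer acts as a hard cap on in-flight data, and the 100 ms round-trip time and 10 Gbit/s link speed are illustrative values that do not come from this report. The class and method names are hypothetical.

```java
public class WindowLimit {
    // Upper bound on throughput when at most windowBytes may be
    // unacknowledged (in flight) during one round trip.
    public static double maxThroughputBytesPerSec(long windowBytes, double rttSeconds) {
        return windowBytes / rttSeconds;
    }

    // Bandwidth-delay product: the number of bytes that must be in flight
    // to keep a link of the given speed fully utilized.
    public static long bandwidthDelayProductBytes(double linkBitsPerSec, double rttSeconds) {
        return Math.round(linkBitsPerSec / 8.0 * rttSeconds);
    }

    public static void main(String[] args) {
        long window = 128L * 1024;   // the 128K buffer size named in the report
        double rtt = 0.100;          // ASSUMPTION: 100 ms round-trip time
        double link = 10e9;          // ASSUMPTION: 10 Gbit/s link

        double cap = maxThroughputBytesPerSec(window, rtt);
        long bdp = bandwidthDelayProductBytes(link, rtt);

        System.out.printf("Throughput cap with fixed window: %.2f MB/s%n", cap / 1e6);
        System.out.printf("Bytes needed in flight to fill the link: %d%n", bdp);
        System.out.printf("Fraction of link usable: %.4f%%%n",
                100.0 * cap / (link / 8.0));
    }
}
```

Under these assumed numbers, the fixed window is several orders of magnitude smaller than the bandwidth-delay product, which is consistent with the poor throughput described above.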



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
