Date: Mon, 12 Dec 2016 07:46:58 +0000 (UTC)
From: "Suresh Bahuguna (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-11234) distcp performance is suboptimal for high bandwidth/high latency setups

[ https://issues.apache.org/jira/browse/HDFS-11234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15741245#comment-15741245 ]

Suresh Bahuguna commented on HDFS-11234:
----------------------------------------

Created a pull request: https://github.com/apache/hadoop/pull/172
> distcp performance is suboptimal for high bandwidth/high latency setups
> -----------------------------------------------------------------------
>
>                 Key: HDFS-11234
>                 URL: https://issues.apache.org/jira/browse/HDFS-11234
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 2.7.1
>            Reporter: Suresh Bahuguna
>
> distcp uses a TCP socket whose buffer size is set to 128K. On a link with very high bandwidth but also very high latency, this gives quite poor throughput: once a buffer's worth of data is in flight, TCP stops sending more data until it receives the ACKs. By not setting the socket buffer size and letting the Linux kernel manage the socket, we should be able to get optimal performance.
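To put rough numbers on the description above, here is a minimal sketch of the window-limited throughput bound (at most one window of data per round trip) and of the bandwidth-delay product. Only the 128K figure comes from the issue; the 100 ms round-trip time, the 10 Gbit/s link speed, and the class and method names are assumptions made up for illustration. Treating the socket buffer size as the effective TCP window is also a simplification.

```java
/**
 * Illustrative only: estimates how a fixed socket buffer caps the throughput
 * of a single TCP connection on a high-latency link. Not taken from distcp.
 */
final class WindowLimitSketch {

    /** Upper bound in bytes/second: at most one window can be sent per round trip. */
    public static double maxThroughputBytesPerSec(long windowBytes, double rttSeconds) {
        return windowBytes / rttSeconds;
    }

    /** Bandwidth-delay product: bytes that must be in flight to keep the link full. */
    public static double bandwidthDelayProductBytes(double linkBitsPerSec, double rttSeconds) {
        return linkBitsPerSec / 8.0 * rttSeconds;
    }

    public static void main(String[] args) {
        long bufferBytes = 128 * 1024;   // buffer size named in the issue
        double rttSeconds = 0.100;       // assumed round-trip time (hypothetical)
        double linkBitsPerSec = 10e9;    // assumed link speed (hypothetical)

        double cap = maxThroughputBytesPerSec(bufferBytes, rttSeconds);
        double bdp = bandwidthDelayProductBytes(linkBitsPerSec, rttSeconds);

        System.out.printf("Throughput cap with a %d-byte window: %.0f bytes/s%n", bufferBytes, cap);
        System.out.printf("Bytes in flight needed to fill the link: %.0f%n", bdp);
    }
}
```

Under these assumed values, the 128K window caps a single connection at about 1.3 MB/s, while filling the link would need roughly 125 MB in flight. That gap is why a fixed small buffer hurts on such links, and why leaving the buffer size to the kernel is expected to help.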