Date: Wed, 11 Jan 2017 23:15:16 +0000 (UTC)
From: "Zheng Shao (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Updated] (HADOOP-13975) Allow DistCp to use MultiThreadedMapper

     [ https://issues.apache.org/jira/browse/HADOOP-13975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HADOOP-13975:
--------------------------------
    Attachment: HADOOP-distcp-multithreaded-mapper-trunk.1.patch

> Allow DistCp to use MultiThreadedMapper
> ---------------------------------------
>
>                 Key: HADOOP-13975
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13975
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: tools/distcp
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>            Priority: Minor
>         Attachments: HADOOP-distcp-multithreaded-mapper-branch26.1.patch, HADOOP-distcp-multithreaded-mapper-trunk.1.patch
>
>
> Although DistCp allows users to control parallelism via the number of mappers, it is sometimes desirable to run fewer mappers with more threads per mapper. Since DistCp is network bound (either by throughput or, more frequently, by the latency of creating connections, opening files, reading/writing files, and closing files), this can make each mapper much more efficient.
> Threads within one mapper can also share many resources, which saves memory and reduces the number of connections to the NameNode.
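
To illustrate why more threads per mapper can help a latency-bound copy, here is a minimal sketch in plain Java. It is not the attached patch and does not use any Hadoop API: the class ThreadedCopySketch, the helpers copyOne and copyAll, the file paths, and the 50 ms latency are all hypothetical, and a sleep stands in for the per-file cost of connecting, opening, copying, and closing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadedCopySketch {

    // Hypothetical stand-in for one file copy: the sleep models per-file
    // latency (connection setup, open, read/write, close).
    static String copyOne(String path, long latencyMillis) {
        try {
            Thread.sleep(latencyMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return path;
    }

    // Copies every file using a fixed pool of worker threads inside one
    // process, the way a single mapper could run several copies at once.
    static List<String> copyAll(List<String> files, int threads, long latencyMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String f : files) {
                pending.add(pool.submit(() -> copyOne(f, latencyMillis)));
            }
            List<String> done = new ArrayList<>();
            for (Future<String> p : pending) {
                done.add(p.get()); // wait for each copy, keeping input order
            }
            return done;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Made-up paths; compare one worker thread against four.
        List<String> files = Arrays.asList("/src/a", "/src/b", "/src/c", "/src/d");
        for (int threads : new int[] {1, 4}) {
            long start = System.nanoTime();
            int copied = copyAll(files, threads, 50).size();
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " thread(s): " + copied
                    + " files in ~" + elapsedMs + " ms");
        }
    }
}
```

With one thread the four simulated copies run back to back, so the run takes at least 200 ms. With four threads the waits overlap and the elapsed time should be much closer to a single 50 ms copy. All threads live in one process, which is the sharing the description refers to. Real gains depend on the cluster and the file systems involved.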