Date: Wed, 22 Oct 2014 14:38:34 +0000 (UTC)
From: "cntic (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Updated] (YARN-2681) Support bandwidth enforcement for containers while reading from HDFS

     [ https://issues.apache.org/jira/browse/YARN-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

cntic updated YARN-2681:
------------------------
    Description:
To read or write HDFS data on a data node, applications establish TCP/IP connections with the datanode. HDFS reads can be controlled by configuring the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections.
The current cgroups net_cls concept cannot be applied on the node where the container is launched, nor on the data node, because:
- TC handles outgoing bandwidth only, so it cannot be set on the container node (HDFS read = incoming data for the container)
- Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode.

Tasks:
1) Extend the Resource model to define a bandwidth enforcement rate
2) Monitor TCP/IP connections established by the container handling process and its child processes
3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data

Concept:
http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf

  was:
To read or write HDFS data on a data node, applications establish TCP/IP connections with the datanode. HDFS reads can be controlled by configuring the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections.

The current cgroups net_cls concept cannot be applied on the node where the container is launched, nor on the data node, because:
- TC handles outgoing bandwidth only, so it cannot be set on the container node (HDFS read = incoming data for the container)
- Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode.
Tasks:
1) Extend the Resource model to define a bandwidth enforcement rate
2) Monitor TCP/IP connections established by the container handling process and its child processes
3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data


> Support bandwidth enforcement for containers while reading from HDFS
> --------------------------------------------------------------------
>
>                 Key: YARN-2681
>                 URL: https://issues.apache.org/jira/browse/YARN-2681
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: capacityscheduler, nodemanager, resourcemanager
>    Affects Versions: 2.5.1
>         Environment: Linux
>            Reporter: cntic
>         Attachments: HADOOP-2681.patch, Traffic Control Design.png
>
>
> To read or write HDFS data on a data node, applications establish TCP/IP connections with the datanode. HDFS reads can be controlled by configuring the Linux Traffic Control (TC) subsystem on the data node to apply filters to the appropriate connections.
> The current cgroups net_cls concept cannot be applied on the node where the container is launched, nor on the data node, because:
> - TC handles outgoing bandwidth only, so it cannot be set on the container node (HDFS read = incoming data for the container)
> - Since the HDFS data node is handled by only one process, it is not possible to use net_cls to separate connections from different containers to the datanode.
> Tasks:
> 1) Extend the Resource model to define a bandwidth enforcement rate
> 2) Monitor TCP/IP connections established by the container handling process and its child processes
> 3) Set Linux Traffic Control rules on the data node based on address:port pairs in order to enforce the bandwidth of outgoing data
> Concept:
> http://www.hit.bme.hu/~do/papers/EnforcementDesign.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
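
Illustration of task 3 in the description: a minimal sketch, not taken from the attached patch, of how TC rules on the data node might be derived from one container connection's address:port pair. It only builds the command strings and does not run them. The function name, the device name "eth0", the choice of an HTB qdisc with a u32 filter, and all example values are assumptions made for illustration.

```python
from typing import List


def build_tc_commands(dev: str, rate_mbit: int, class_minor: int,
                      container_ip: str, container_port: int) -> List[str]:
    """Build tc commands that cap datanode egress toward one container connection.

    Hypothetical helper: traffic leaving the datanode for
    container_ip:container_port is steered into an HTB class limited to
    rate_mbit. Nothing is executed here; the caller decides how to run them.
    """
    if not 1 <= container_port <= 65535:
        raise ValueError("container_port must be in 1..65535")
    if not 1 <= class_minor <= 0xFFFF:
        raise ValueError("class_minor must be in 1..65535")

    # tc reads the minor part of a class id as hexadecimal.
    classid = f"1:{class_minor:x}"

    return [
        # Root HTB qdisc; needed only once per device, not once per connection.
        f"tc qdisc add dev {dev} root handle 1: htb",
        # One class per enforced connection, carrying the bandwidth limit.
        f"tc class add dev {dev} parent 1: classid {classid} htb rate {rate_mbit}mbit",
        # u32 filter: match the destination address and port of the container side.
        f"tc filter add dev {dev} protocol ip parent 1: prio 1 u32 "
        f"match ip dst {container_ip}/32 match ip dport {container_port} 0xffff "
        f"flowid {classid}",
    ]


if __name__ == "__main__":
    # Example values are made up: limit one connection to 50 Mbit/s.
    for cmd in build_tc_commands("eth0", 50, 16, "10.0.0.5", 45000):
        print(cmd)
```

Because the filter matches the destination address and port as seen from the data node, it separates connections per container even though the datanode itself runs as a single process, which is the case net_cls cannot handle.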