Date: Fri, 15 Apr 2016 18:17:25 +0000 (UTC)
From: "Nathan Roberts (JIRA)"
To: yarn-issues@hadoop.apache.org
Reply-To: yarn-issues@hadoop.apache.org
Subject: [jira] [Updated] (YARN-4964) Allow ShuffleHandler readahead without drop-behind

     [ https://issues.apache.org/jira/browse/YARN-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nathan Roberts updated YARN-4964:
---------------------------------
    Attachment: YARN-4964.001.patch

> Allow ShuffleHandler readahead without drop-behind
> --------------------------------------------------
>
>                 Key: YARN-4964
>                 URL: https://issues.apache.org/jira/browse/YARN-4964
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 3.0.0, 2.7.2
>            Reporter: Nathan Roberts
>            Assignee: Nathan Roberts
>         Attachments: YARN-4964.001.patch
>
>
> Currently mapreduce.shuffle.manage.os.cache enables/disables both the readahead (POSIX_FADV_WILLNEED) and the drop-behind (POSIX_FADV_DONTNEED) logic within the ShuffleHandler.
> It would be beneficial if these were separately configurable.
> - Running without readahead can lead to significant seek storms caused by large numbers of sendfile() calls competing with one another.
> - However, running with drop-behind can also lead to seek storms. There are cases where the server successfully writes the shuffle bytes to the network, but the client doesn't want the bytes right now (for example, when MergeManager wants to WAIT), so it ignores them and asks for them again a bit later. This causes repeated reads of the same data from disk.
> I'll attach a simple patch that enables/disables readahead based on mapreduce.shuffle.readahead.bytes==0 (a value of 0 disables readahead), leaving mapreduce.shuffle.manage.os.cache controlling only the drop-behind.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
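
The decoupling proposed in the description can be sketched roughly as follows. This is not the code in YARN-4964.001.patch. The class name, the fromConf helper, and the default values are hypothetical, and a plain Map stands in for the Hadoop Configuration object. Only the two property names come from the issue text. Treating negative values the same as 0 is also an assumption.

```java
import java.util.Map;

/** Hypothetical sketch of the proposed behaviour; not the attached patch. */
class ShuffleCachePolicySketch {
    // Property names as given in the issue description.
    static final String MANAGE_OS_CACHE = "mapreduce.shuffle.manage.os.cache";
    static final String READAHEAD_BYTES = "mapreduce.shuffle.readahead.bytes";

    // Assumed defaults, used here for illustration only.
    static final boolean ASSUMED_MANAGE_OS_CACHE = true;
    static final int ASSUMED_READAHEAD_BYTES = 4 * 1024 * 1024;

    final int readaheadBytes;
    final boolean readaheadEnabled;   // would gate the POSIX_FADV_WILLNEED hint
    final boolean dropBehindEnabled;  // would gate the POSIX_FADV_DONTNEED hint

    private ShuffleCachePolicySketch(int readaheadBytes, boolean manageOsCache) {
        this.readaheadBytes = readaheadBytes;
        // Readahead depends only on the readahead size (0 disables it;
        // negative values are treated as 0, which is an assumption).
        this.readaheadEnabled = readaheadBytes > 0;
        // Drop-behind depends only on manage.os.cache.
        this.dropBehindEnabled = manageOsCache;
    }

    /** Builds the policy from a map standing in for the real configuration. */
    static ShuffleCachePolicySketch fromConf(Map<String, String> conf) {
        String bytes = conf.get(READAHEAD_BYTES);
        String manage = conf.get(MANAGE_OS_CACHE);
        int readahead = (bytes == null)
            ? ASSUMED_READAHEAD_BYTES
            : Integer.parseInt(bytes.trim());
        boolean manageOsCache = (manage == null)
            ? ASSUMED_MANAGE_OS_CACHE
            : Boolean.parseBoolean(manage.trim());
        return new ShuffleCachePolicySketch(readahead, manageOsCache);
    }

    public static void main(String[] args) {
        // Readahead without drop-behind, the case named in the issue title.
        ShuffleCachePolicySketch p = fromConf(Map.of(MANAGE_OS_CACHE, "false"));
        System.out.println("readahead=" + p.readaheadEnabled
            + " dropBehind=" + p.dropBehindEnabled);
    }
}
```

Under this sketch, setting mapreduce.shuffle.manage.os.cache to false while leaving mapreduce.shuffle.readahead.bytes non-zero gives readahead without drop-behind, and setting mapreduce.shuffle.readahead.bytes to 0 turns readahead off independently of the other setting.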