Date: Wed, 10 Oct 2018 16:58:00 +0000 (UTC)
From: "Kuhu Shukla (JIRA)"
To: issues@tez.apache.org
Reply-To: dev@tez.apache.org
Subject: [jira] [Commented] (TEZ-3990) The number of shuffle penalties for a host/inputAttemptIdentifier should be capped

    [ https://issues.apache.org/jira/browse/TEZ-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645254#comment-16645254 ]

Kuhu Shukla commented on TEZ-3990:
----------------------------------

Missed a variable name change. d'oh.

> The number of shuffle penalties for a host/inputAttemptIdentifier should be capped
> ----------------------------------------------------------------------------------
>
>                 Key: TEZ-3990
>                 URL: https://issues.apache.org/jira/browse/TEZ-3990
>             Project: Apache Tez
>          Issue Type: Bug
>   Affects Versions: 0.9.1, 0.10.0
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>            Priority: Major
>         Attachments: TEZ-3990.001.patch, TEZ-3990.002.patch, TEZ-3990.003.patch, TEZ-3990.004.patch, TEZ-3990.005.patch, TEZ-3990.006..patch
>
>
> In a scenario where the same mapId fetches fail, the penalty code allows adding the same Host/InputAttemptIdentifier over and over with revised penalty time that grows exponentially. It should at some point drop the retrying and report failure to the AM asap to allow the job to rectify the upstream output.
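
For illustration only, the capping behavior requested in the description might be sketched as below. This is not the Tez shuffle scheduler code or any of the attached patches; the class name, the `penalize` method, the base delay, and the cap value are all hypothetical, and Python is used purely for brevity. The idea is to count penalties per (host, input attempt) pair, keep the exponential backoff while under the cap, and signal "stop retrying, report the failure" once the cap is reached.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical names throughout; this is not the Tez shuffle scheduler API.
PenaltyKey = Tuple[str, str]  # (host, input attempt identifier)


@dataclass
class CappedPenaltyTracker:
    base_delay_ms: int = 1000  # assumed initial penalty
    max_penalties: int = 4     # assumed cap on penalties per key
    counts: Dict[PenaltyKey, int] = field(default_factory=dict)

    def penalize(self, host: str, attempt_id: str) -> Optional[int]:
        """Return the next penalty delay in ms, or None once the cap is hit.

        None means: stop retrying and report the fetch failure upstream.
        """
        key = (host, attempt_id)
        count = self.counts.get(key, 0)
        if count >= self.max_penalties:
            return None
        self.counts[key] = count + 1
        # Exponential backoff: base, 2*base, 4*base, ...
        return self.base_delay_ms * (2 ** count)


tracker = CappedPenaltyTracker(base_delay_ms=1000, max_penalties=3)
delays = [tracker.penalize("host-1", "attempt_0") for _ in range(5)]
print(delays)
```

With a cap of 3, the first three calls return growing delays and every later call returns `None`, which a caller could treat as the point to report the failure to the AM instead of queuing another penalty.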

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)