Date: Fri, 4 Jan 2019 21:24:00 +0000 (UTC)
From: "Jason Lowe (JIRA)"
To: issues@tez.apache.org
Subject: [jira] [Commented] (TEZ-394) Better scheduling for uneven DAGs

[ https://issues.apache.org/jira/browse/TEZ-394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16734580#comment-16734580 ]

Jason Lowe commented on TEZ-394:
--------------------------------

Attaching a new patch that makes this behavior configurable and disabled by default. This avoids the bad preemption behavior that Gopal encountered when running with the default YARN task scheduler but allows users to enable it in conjunction with a DAG-aware task scheduler like DagAwareYarnTaskScheduler.

> Better scheduling for uneven DAGs
> ---------------------------------
>
> Key: TEZ-394
> URL: https://issues.apache.org/jira/browse/TEZ-394
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Rohini Palaniswamy
> Assignee: Jason Lowe
> Priority: Major
> Attachments: TEZ-394.001.patch, TEZ-394.002.patch, TEZ-394.003.patch, TEZ-394.004.patch
>
>
> Consider a series of joins or group by on dataset A with few datasets that takes 10 hours followed by a final join with a dataset X. The vertex that loads dataset X will be one of the top vertexes and initialized early even though its output is not consumed till the end after 10 hours.
> 1) Could either use delayed start logic for better resource allocation
> 2) Else if they are started upfront, need to handle failure/recovery cases where the nodes which executed the MapTask might have gone down when the final join happens.
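
To make option 1 in the quoted description concrete, here is a rough sketch of one possible "delayed start" heuristic: hold each vertex back until one level before its earliest consumer can run. This is only an illustration of the idea, not what the attached patch or any Tez scheduler does. The function names, the edge-list representation, and the vertex names (A, B, C, D, X, J1, J2, J3, FINAL) are all made up for the example.

```python
from collections import defaultdict, deque


def longest_depths(edges):
    """Return (depth, consumers) for a DAG given as (producer, consumer) pairs.

    depth[v] is the length of the longest path from any root to v,
    so root vertices have depth 0.
    """
    consumers = defaultdict(list)
    indegree = defaultdict(int)
    vertices = set()
    for src, dst in edges:
        consumers[src].append(dst)
        indegree[dst] += 1
        vertices.update((src, dst))

    depth = {v: 0 for v in vertices}
    # Kahn-style topological walk, starting from vertices with no inputs.
    ready = deque(sorted(v for v in vertices if indegree[v] == 0))
    visited = 0
    while ready:
        v = ready.popleft()
        visited += 1
        for c in consumers[v]:
            depth[c] = max(depth[c], depth[v] + 1)
            indegree[c] -= 1
            if indegree[c] == 0:
                ready.append(c)
    if visited != len(vertices):
        raise ValueError("graph has a cycle")
    return depth, consumers


def delayed_start_levels(edges):
    """Hypothetical heuristic: start a vertex one level before its earliest
    consumer; a vertex with no consumers starts at its own depth."""
    depth, consumers = longest_depths(edges)
    return {
        v: (min(depth[c] for c in consumers[v]) - 1) if consumers[v] else depth[v]
        for v in depth
    }


# The DAG from the description: a chain of joins on A, then a final join with X.
edges = [
    ("A", "J1"), ("B", "J1"),
    ("J1", "J2"), ("C", "J2"),
    ("J2", "J3"), ("D", "J3"),
    ("J3", "FINAL"), ("X", "FINAL"),
]
levels = delayed_start_levels(edges)
```

With this heuristic the vertex loading X is a root of the DAG but gets start level 3, just ahead of the final join, while A and B keep level 0. It says nothing about option 2 (recovery when the producing nodes are lost), which would still need handling if vertices are started upfront.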