Date: Fri, 10 Jul 2015 13:07:04 +0000 (UTC)
From: "Rajesh Balamohan (JIRA)"
To: issues@tez.apache.org
Reply-To: dev@tez.apache.org
Subject: [jira] [Created] (TEZ-2612) Critical path analyzer for DAGs

Rajesh Balamohan created TEZ-2612:
-------------------------------------

             Summary: Critical path analyzer for DAGs
                 Key: TEZ-2612
                 URL: https://issues.apache.org/jira/browse/TEZ-2612
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Rajesh Balamohan


This analyzer plugin/tool can be used to identify the vertices/tasks of interest in a large DAG for performance analysis and for finding bottlenecks. It can be used to find out:
1. input dependency,
2. failure dependency,
3. scheduling dependency (maybe at a later stage).

Creating this as an uber ticket.

Getting this detail at the vertex level might be possible with the existing logs derived from ATS. For task-level analysis, some more details are required:

1. Timeline details, such as when fetch/merge/compute/sort etc. happen, are not captured now. These details could possibly be added to TaskCompletionEvent.
2. Additional details are needed, such as the last event that completed processing in the input (for tracing at the task level).
3. Add the downstream task attempt that caused the higher-level task to get rescheduled/restarted. This can be used to understand cases where the task failed due to a read error.
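
As a rough illustration of the vertex-level input-dependency analysis described above, here is a minimal Python sketch. It is not Tez code and does not use any Tez or ATS API: the `critical_path` helper, the vertex names, and the durations are all hypothetical, and it assumes that per-vertex run times and the input edges have already been extracted (for example from ATS-derived logs). It also ignores failure and scheduling dependencies.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(durations, inputs):
    """Return (total_time, path) for the longest input-dependency chain.

    durations: vertex name -> run time in seconds (hypothetical data)
    inputs:    vertex name -> list of upstream vertices it reads from
    Every vertex named in `inputs` is assumed to have an entry in `durations`.
    """
    finish = {}   # earliest possible finish time of each vertex
    blocker = {}  # the upstream vertex that finished last, or None for roots

    # static_order() yields each vertex only after all of its upstream vertices.
    graph = {v: inputs.get(v, ()) for v in durations}
    for v in TopologicalSorter(graph).static_order():
        ups = inputs.get(v, ())
        last = max(ups, key=finish.__getitem__, default=None)
        blocker[v] = last
        start = finish[last] if last is not None else 0
        finish[v] = start + durations[v]

    # Walk back from the vertex that finished last, following the blockers.
    end = max(finish, key=finish.__getitem__)
    path = []
    v = end
    while v is not None:
        path.append(v)
        v = blocker[v]
    path.reverse()
    return finish[end], path


# Hypothetical four-vertex DAG: two maps feed Reducer3, which feeds Reducer4.
durations = {"Map1": 30, "Map2": 45, "Reducer3": 20, "Reducer4": 10}
inputs = {"Reducer3": ["Map1", "Map2"], "Reducer4": ["Reducer3"]}
print(critical_path(durations, inputs))  # -> (75, ['Map2', 'Reducer3', 'Reducer4'])
```

In this toy DAG, Map2 rather than Map1 is on the critical path because it is the slower input of Reducer3. A task-level version would need the extra timeline details listed in the ticket, which this sketch does not model.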