Date: Fri, 31 Oct 2014 05:41:33 +0000 (UTC)
From: "Jeff Zhang (JIRA)"
To: issues@tez.apache.org
Reply-To: dev@tez.apache.org
Subject: [jira] [Commented] (TEZ-1703) Exception handling for InputInitializer

[ https://issues.apache.org/jira/browse/TEZ-1703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14191404#comment-14191404 ]

Jeff Zhang commented on TEZ-1703:
---------------------------------

[~sseth] The original patch has a race in the test case. I wanted to simulate INIT_FAILED arriving after INIT_SUCCEEDED in the InputInitializer. INIT_SUCCEEDED causes the RootInputInitializerManager to shut down, so I made the InputInitializer thread sleep for 1 second to wait for that shutdown, then catch the exception and throw it.
But the problem is that executor.shutdownNow() is not a blocking method, so this results in a race between the InputInitializer thread and the AsyncDispatcher thread. It is not easy to simulate INIT_FAILED after INIT_SUCCEEDED from within the InputInitializer, so in the new patch I do it in the AsyncDispatcher thread.

commit 8f8a81f7a17f9018ae4e87bf0fca9d6cdc0a5ba4 (HEAD, origin/master, origin/HEAD, master, TEZ-1703)
Author: Jeff Zhang
Date: Fri Oct 31 13:30:10 2014 +0800

    TEZ-1703. addendum - fix flaky test. (zjffdu)

> Exception handling for InputInitializer
> ---------------------------------------
>
> Key: TEZ-1703
> URL: https://issues.apache.org/jira/browse/TEZ-1703
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.5.1
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Fix For: 0.5.2
>
> Attachments: TEZ-1703-2.patch, TEZ-1703-3.patch, TEZ-1703-4.patch, TEZ-1703.patch
>
>
> For handleInputInitializerEvent - this should be fairly straightforward to handle. At the moment this is an inline call from within the AsyncDispatcher, and will end up causing a RuntimeException. The RuntimeException can be changed to an AMUserCodeException which will take care of this.
> For onVertexStateUpdated, this eventually gets invoked from within RootInputInitializerManager. Catching exceptions there and sending a RootInputInitialzierFailedEvent should be enough to fix this? May require some state machine changes to handle this event on a few more states.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
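
For readers unfamiliar with the race mentioned in the comment: per the JDK documentation, ExecutorService.shutdownNow() attempts to stop running tasks (typically by interrupting them) and returns without waiting for them to terminate; awaitTermination() is the call that blocks. The sketch below illustrates only that general behaviour. It is not code from the TEZ-1703 patch, and the class name ShutdownNowSketch and both helper methods are hypothetical names made up for this illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration; not part of Tez.
public class ShutdownNowSketch {

    // Interrupt the workers, then block until they exit or the timeout expires.
    public static boolean shutdownAndWait(ExecutorService executor, long timeoutMs)
            throws InterruptedException {
        executor.shutdownNow(); // returns immediately; tasks may still be running
        return executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
    }

    // Runs one task that keeps working for a short while after being interrupted,
    // and reports whether it had finished by the time the shutdown call returned.
    public static boolean taskFinishedAfterShutdown(boolean waitForTermination)
            throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean finished = new AtomicBoolean(false);

        executor.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(10_000); // stands in for the initializer's work
            } catch (InterruptedException e) {
                // Interrupted by shutdownNow(); simulate cleanup that takes ~200 ms.
                long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(200);
                while (System.nanoTime() < deadline) {
                    // busy wait, standing in for post-interrupt work
                }
            }
            finished.set(true);
        });

        started.await(); // make sure the task is running before shutting down
        if (waitForTermination) {
            shutdownAndWait(executor, 5_000);
        } else {
            executor.shutdownNow(); // no wait: the task is usually still cleaning up
        }
        return finished.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("finished without waiting: " + taskFinishedAfterShutdown(false));
        System.out.println("finished after awaitTermination: " + taskFinishedAfterShutdown(true));
    }
}
```

Without the awaitTermination() call, whether the task has finished when the caller proceeds depends on thread timing, which is the kind of nondeterminism that makes a test flaky. Whether waiting would have suited the Tez test is a separate question; the comment above describes a different fix, moving the simulated failure into the AsyncDispatcher thread.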