Date: Wed, 16 Aug 2017 22:22:00 +0000 (UTC)
From: "Satish Subhashrao Saley (JIRA)"
To: pig-dev@hadoop.apache.org
Reply-To: dev@pig.apache.org
Subject: [jira] [Updated] (PIG-5273) _SUCCESS file should be created at the end of the job

     [ https://issues.apache.org/jira/browse/PIG-5273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Satish Subhashrao Saley updated PIG-5273:
-----------------------------------------
    Status: Patch Available  (was: Open)

> _SUCCESS file should be created at the end of the job
> -----------------------------------------------------
>
>                 Key: PIG-5273
>                 URL: https://issues.apache.org/jira/browse/PIG-5273
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Satish Subhashrao Saley
>            Assignee: Satish Subhashrao Saley
>         Attachments: PIG-5273-1.patch
>
>
> One of the users ran into issues because the _SUCCESS file was created by FileOutputCommitter.commitJob(), and storeCleanup(), which PigOutputCommitter calls after that, failed to store the schema due to a network outage. abortJob was then called, and the StoreFunc.cleanupOnFailure method invoked from it deleted the output directory.
> Downstream jobs that started because of the _SUCCESS file ran with empty data.
> Possible solutions:
> 1) Move storeCleanup before commit. We found that the order was reversed in https://issues.apache.org/jira/browse/PIG-2642, probably because of FileOutputCommitter version 1; it might not be a problem with FileOutputCommitter version 2. This would still not help when there are multiple outputs, as the main problem is cleanupOnFailure in abortJob deleting directories.
> 2) Change cleanupOnFailure to not delete output directories. This still does not help: the Oozie action retry might kick in and delete the directory while the downstream job has already started running because of the _SUCCESS file.
> 3) It cannot be done in the OutputCommitter at all, as multiple output committers are called in parallel in Tez. We can have Pig suppress _SUCCESS creation and try creating them all at the end in TezLauncher if the job has succeeded, before calling cleanupOnSuccess. This can probably be added as a configurable setting and turned on by default in our clusters. This is probably the most viable solution.
> Thank you [~rohini] for finding out the issue and providing the solution.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
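
The ordering proposed in option 3 can be illustrated with a small sketch. This is not the attached patch and does not use the Pig, Tez, or Hadoop APIs: it models output directories on the local filesystem, and the names `finalize_outputs`, `store_cleanup`, and `SUCCESS_MARKER` are hypothetical. It assumes the per-output _SUCCESS creation has already been suppressed, so the only place a marker can appear is the final step, after cleanup has succeeded for every output.

```python
from pathlib import Path
from typing import Callable, Sequence

# Marker file name that downstream jobs poll for.
SUCCESS_MARKER = "_SUCCESS"


def finalize_outputs(
    output_dirs: Sequence[Path],
    store_cleanup: Callable[[Path], None],
) -> bool:
    """Write _SUCCESS markers only after every output's cleanup succeeded.

    `store_cleanup` is a hypothetical stand-in for the post-commit work
    (for example, storing the schema) and may raise OSError.
    Returns True if the markers were written, False otherwise.
    """
    try:
        # Step 1: run the post-commit work for ALL outputs first.
        for out in output_dirs:
            store_cleanup(out)
    except OSError:
        # A failure anywhere means no output is marked as successful,
        # so downstream jobs are never triggered on incomplete data.
        return False

    # Step 2: the whole job is known to be good; mark every output.
    for out in output_dirs:
        (out / SUCCESS_MARKER).touch()
    return True


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        outputs = [Path(tmp) / "out1", Path(tmp) / "out2"]
        for out in outputs:
            out.mkdir()

        def flaky_cleanup(out: Path) -> None:
            # Simulate the network outage while storing the schema.
            if out.name == "out2":
                raise OSError("simulated network outage")

        print(finalize_outputs(outputs, flaky_cleanup))
        print([(o / SUCCESS_MARKER).exists() for o in outputs])
```

The point of the two-phase structure is that a marker is never visible while any step that can still fail and trigger directory deletion remains pending, which is the race described in the issue.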