Date: Tue, 9 Aug 2016 22:21:20 +0000 (UTC)
From: "Mithun Radhakrishnan (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-13756) Map failure attempts to delete reducer _temporary directory on multi-query pig query

[ https://issues.apache.org/jira/browse/HIVE-13756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414342#comment-15414342 ]

Mithun Radhakrishnan commented on HIVE-13756:
---------------------------------------------

+1.

> Map failure attempts to delete reducer _temporary directory on multi-query pig query
> ------------------------------------------------------------------------------------
>
> Key: HIVE-13756
> URL: https://issues.apache.org/jira/browse/HIVE-13756
> Project: Hive
> Issue Type: Bug
> Components: HCatalog
> Affects Versions: 1.2.1, 2.0.0
> Reporter: Chris Drome
> Assignee: Chris Drome
> Attachments: HIVE-13756-branch-1.patch, HIVE-13756.1-branch-1.patch, HIVE-13756.1.patch, HIVE-13756.patch
>
>
> A Pig script executed with multi-query enabled can produce erroneous results if any map fails. The script in question reads the source data and writes it as-is into TABLE_A, and also performs a group-by operation on the data, the result of which is written into TABLE_B. This results in a single MR job that writes the map output to a scratch directory relative to TABLE_A and the reducer output to a scratch directory relative to TABLE_B.
> If one or more maps fail, the attempt data relative to TABLE_A is deleted, but so is the _temporary directory relative to TABLE_B. This has the unintended side effect of preventing subsequent maps from committing their data. Maps that completed successfully before the first map failure have their data committed as expected, but later maps do not, resulting in an incomplete result set.
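
The failure mode described in the issue can be illustrated with a toy model. The sketch below is not Hive, HCatalog, or Hadoop code: the function name `run_job`, the directory names `_scratch` and `data`, the single retry of the failed map, and the rule that a map can only commit while the TABLE_B `_temporary` directory exists are all assumptions made for illustration. It only mimics the sequence of events on a local temporary directory.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def run_job(root: Path, num_maps: int, failing_map: int, buggy_abort: bool) -> List[str]:
    """Toy model of one MR job writing to two tables; returns committed part names."""
    scratch_a = root / "table_a" / "_scratch"   # hypothetical map-side scratch dir
    temp_b = root / "table_b" / "_temporary"    # reducer-side _temporary dir
    final_a = root / "table_a" / "data"         # hypothetical committed output dir
    for d in (scratch_a, temp_b, final_a):
        d.mkdir(parents=True)

    for i in range(num_maps):
        for attempt_no in range(2):             # assumption: at most one retry
            attempt = scratch_a / f"attempt_{i}_{attempt_no}"
            attempt.mkdir()
            (attempt / "part").write_text(f"rows from map {i}\n")

            if i == failing_map and attempt_no == 0:
                # Abort of the failed attempt: its own data is removed ...
                shutil.rmtree(attempt)
                if buggy_abort:
                    # ... and, in the buggy case, so is TABLE_B's _temporary dir.
                    shutil.rmtree(temp_b, ignore_errors=True)
                continue                        # retry the failed map

            # Assumption: a commit only succeeds while _temporary still exists.
            if temp_b.exists():
                (attempt / "part").rename(final_a / f"part-{i}")
            break

    return sorted(p.name for p in final_a.iterdir())


def demo(buggy_abort: bool) -> List[str]:
    """Run four maps, with the first attempt of map 2 failing."""
    with tempfile.TemporaryDirectory() as d:
        return run_job(Path(d), num_maps=4, failing_map=2, buggy_abort=buggy_abort)


if __name__ == "__main__":
    print("buggy abort:  ", demo(True))
    print("scoped abort: ", demo(False))
```

In this model, the buggy abort leaves only the parts committed before the first failure, while an abort scoped to the failed attempt lets the retry and the later maps commit as well.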