Date: Fri, 8 Sep 2017 20:36:00 +0000 (UTC)
From: "Prasanth Jayachandran (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Updated] (HIVE-17403) Fail concatenation for unmanaged and transactional tables

     [ https://issues.apache.org/jira/browse/HIVE-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-17403:
-----------------------------------------
    Attachment: HIVE-17403.2.patch

> Fail concatenation for unmanaged and transactional tables
> ---------------------------------------------------------
>
>                 Key: HIVE-17403
>                 URL: https://issues.apache.org/jira/browse/HIVE-17403
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.3.0, 3.0.0, 2.4.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Blocker
>         Attachments: HIVE-17403.1.patch, HIVE-17403.2.patch
>
>
> ALTER TABLE .. CONCATENATE should fail if the table is not managed by Hive.
> For unmanaged tables, file names can be anything. Hive makes assumptions about file names that can result in data loss for unmanaged tables.
> For example, consider a table/partition containing two different files (part-m-00000__1417075294718 and part-m-00018__1417075294718).
> Although these are completely different files, Hive treats them as files generated by separate instances of the same task (because of failure or speculative execution) and ends up removing one of them:
> {code}
> 2017-08-28T18:19:29,516 WARN [b27f10d5-d957-4695-ab2a-1453401793df main]: exec.Utilities (:()) - Duplicate taskid file removed: file:/Users/table/part=20141120/.hive-staging_hive_2017-08-28_18-19-27_210_3381701454205724533-1/_tmp.-ext-10000/part-m-00018__1417075294718 with length 958510. Existing file: file:/Users/table/part=20141120/.hive-staging_hive_2017-08-28_18-19-27_210_3381701454205724533-1/_tmp.-ext-10000/part-m-00000__1417075294718 with length 1123116
> {code}
> DDL should restrict concatenation for unmanaged tables.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
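
The failure mode described in the issue can be illustrated with a small sketch. Everything here is an assumption made for illustration: the regular expression, the function names `task_id_from_filename` and `dedupe_by_task_id`, and the "keep the larger file" rule are hypothetical and are not taken from Hive's source. The point is only that a name-based heuristic tuned for Hive-generated file names can map two unrelated, externally written files to the same task id.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical pattern (an assumption, not Hive's code): take the first run of
# digits that can be followed only by an optional "_<attempt>" suffix and an
# optional file extension before the end of the name.
TASK_ID_PATTERN = re.compile(r"^.*?([0-9]+)(_[0-9]{1,6})?(\..*)?$")


def task_id_from_filename(name: str) -> str:
    """Return the task id inferred from a file name, or the name itself."""
    match = TASK_ID_PATTERN.match(name)
    return match.group(1) if match else name


def dedupe_by_task_id(
    files: List[Tuple[str, int]]
) -> Tuple[Dict[str, Tuple[str, int]], List[str]]:
    """Keep one file per inferred task id (the larger one, by assumption).

    `files` is a list of (file name, length in bytes) pairs.
    Returns the kept files keyed by task id and the names that were dropped.
    """
    kept: Dict[str, Tuple[str, int]] = {}
    removed: List[str] = []
    for name, length in files:
        task_id = task_id_from_filename(name)
        if task_id not in kept:
            kept[task_id] = (name, length)
        elif length > kept[task_id][1]:
            # The newcomer is larger: drop the previously kept file.
            removed.append(kept[task_id][0])
            kept[task_id] = (name, length)
        else:
            removed.append(name)
    return kept, removed


# File names and lengths taken from the log line quoted in the issue.
files = [
    ("part-m-00000__1417075294718", 1123116),
    ("part-m-00018__1417075294718", 958510),
]
kept, removed = dedupe_by_task_id(files)
print("kept:", kept)
print("removed:", removed)
```

Under this assumed pattern, the double underscore prevents `00000` and `00018` from being read as the task id, so both names resolve to the trailing timestamp and one file is treated as a duplicate of the other. A name-independent guard, such as rejecting the operation for tables Hive does not manage, avoids relying on the heuristic at all.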