Date: Tue, 7 Oct 2014 00:50:35 +0000 (UTC)
From: "Sushanth Sowmyan (JIRA)"
To: hive-dev@hadoop.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-8371) HCatStorer should fail by default when publishing to an existing partition

    [ https://issues.apache.org/jira/browse/HIVE-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161292#comment-14161292 ]

Sushanth Sowmyan commented on HIVE-8371:
----------------------------------------

Edit:
{quote}
I am okay with implementing (b) if you want to have a safeguard default.
{quote}
Err, that bit should instead read:
{quote}
I am okay with implementing (b) if you want to have the ability to invoke HCatStorer that always maintains that safeguard.
{quote}

> HCatStorer should fail by default when publishing to an existing partition
> --------------------------------------------------------------------------
>
>                 Key: HIVE-8371
>                 URL: https://issues.apache.org/jira/browse/HIVE-8371
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 0.13.0, 0.14.0, 0.13.1
>            Reporter: Thiruvel Thirumoolan
>            Assignee: Thiruvel Thirumoolan
>              Labels: hcatalog, partition
>
> In Hive-12 and before (or in previous HCatalog releases) HCatStorer would fail if the partition already existed (either before launching the job or during commit, depending on the partitioning). HIVE-6406 changed that behavior, and HCatStorer now appends by default. This causes data quality issues, since a rerun (or duplicate run) no longer fails as it used to and instead just appends to the partition.
> A preferable approach would be to leave the HCatStorer behavior as it was (fail on a duplicate publish) and support append through an option. Overwrite can also be implemented in a similar fashion. E.g.:
> store A into 'db.table' using org.apache.hive.hcatalog.pig.HCatStorer('partspec', '', ' -append');
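
For illustration only, the publish semantics requested in the issue (fail by default, with append and overwrite as explicit opt-ins) could be sketched roughly as below. This is not HCatalog code: the dict standing in for a partitioned table and the names PublishMode, PartitionExistsError and publish_partition are all hypothetical, chosen just to make the three behaviours concrete.

```python
from enum import Enum


class PublishMode(Enum):
    # Hypothetical modes mirroring the behaviours discussed in the issue.
    FAIL = "fail"            # proposed default: refuse to touch an existing partition
    APPEND = "append"        # opt-in: add rows to an existing partition
    OVERWRITE = "overwrite"  # opt-in: replace an existing partition


class PartitionExistsError(Exception):
    """Raised when publishing to an existing partition in FAIL mode."""


def publish_partition(table, partspec, rows, mode=PublishMode.FAIL):
    """Publish rows into table[partspec] according to mode.

    `table` is a plain dict standing in for a partitioned table
    (an assumption of this sketch, not how HCatalog stores data).
    """
    exists = partspec in table
    if exists and mode is PublishMode.FAIL:
        # A rerun or duplicate run is rejected instead of silently appending.
        raise PartitionExistsError(f"partition {partspec!r} already exists")
    if exists and mode is PublishMode.APPEND:
        table[partspec] = table[partspec] + list(rows)
    else:
        # New partition, or OVERWRITE of an existing one.
        table[partspec] = list(rows)
    return table[partspec]
```

With this sketch, calling publish_partition twice on the same partspec without a mode raises PartitionExistsError on the second call, while passing PublishMode.APPEND reproduces the append behaviour that the issue says HIVE-6406 made the default.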