Date: Thu, 17 Nov 2016 18:31:58 +0000 (UTC)
From: Sergio Peña (JIRA)
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-15199) INSERT INTO data on S3 is replacing the old rows with the new ones

    [ https://issues.apache.org/jira/browse/HIVE-15199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15674461#comment-15674461 ]

Sergio Peña commented on HIVE-15199:
------------------------------------

Attached a new patch that
addresses the feedback comments. This patch calls listFiles() whatever the filesystem is, HDFS or S3, and it also calls the rename() method on S3 to take advantage of the server-side copy.

[~steve_l] Thanks for creating the bug on HADOOP. About your suggestion regarding the exception: when would it be thrown? When the destination file already exists? Wouldn't that be inconsistent with the HDFS rename(), which doesn't throw an exception?

> INSERT INTO data on S3 is replacing the old rows with the new ones
> ------------------------------------------------------------------
>
>                 Key: HIVE-15199
>                 URL: https://issues.apache.org/jira/browse/HIVE-15199
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
>            Priority: Critical
>         Attachments: HIVE-15199.1.patch, HIVE-15199.2.patch, HIVE-15199.3.patch, HIVE-15199.4.patch
>
>
> Any INSERT INTO statement run on an S3 table while the scratch directory is also stored on S3 deletes the table's old rows.
> {noformat}
> hive> set hive.blobstore.use.blobstore.as.scratchdir=true;
> hive> create table t1 (id int, name string) location 's3a://spena-bucket/t1';
> hive> insert into table t1 values (1,'name1');
> hive> select * from t1;
> 1 name1
> hive> insert into table t1 values (2,'name2');
> hive> select * from t1;
> 2 name2
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
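
To make the failure mode in the report concrete, here is a minimal sketch in Python on a local filesystem. It is not the code in the attached patches and does not use the Hadoop FileSystem API. It assumes the fix boils down to listing the destination directory before the move and renaming each new file to a name that does not collide with an existing one. The helper names (move_without_overwrite, demo), the "_copy_N" suffix, and the file name 000000_0 are all hypothetical, chosen only for illustration.

```python
import os
import tempfile


def move_without_overwrite(src_path, dest_dir):
    """Move src_path into dest_dir without replacing an existing file.

    Hypothetical helper: the "_copy_N" suffix is an illustrative naming
    scheme, not something taken from the patch.
    """
    existing = set(os.listdir(dest_dir))  # plays the role of listFiles()
    base = os.path.basename(src_path)
    name, n = base, 0
    while name in existing:  # pick the first name that is still free
        n += 1
        name = f"{base}_copy_{n}"
    target = os.path.join(dest_dir, name)
    os.rename(src_path, target)  # plays the role of rename()
    return target


def demo():
    """Simulate two INSERT INTO statements that stage the same file name."""
    with tempfile.TemporaryDirectory() as root:
        scratch = os.path.join(root, "scratch")
        table = os.path.join(root, "t1")
        os.makedirs(scratch)
        os.makedirs(table)
        for row in ("1\tname1\n", "2\tname2\n"):
            staged = os.path.join(scratch, "000000_0")
            with open(staged, "w") as f:
                f.write(row)
            move_without_overwrite(staged, table)
        files = sorted(os.listdir(table))
        rows = []
        for fname in files:
            with open(os.path.join(table, fname)) as f:
                rows.extend(f.read().splitlines())
        return files, rows


if __name__ == "__main__":
    print(demo())
```

In this sketch both rows survive because the second staged file lands under a new name. Moving it straight onto the existing 000000_0 would instead leave only the second row, which matches the symptom shown in the report.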