Return-Path:
X-Original-To: apmail-hive-dev-archive@www.apache.org
Delivered-To: apmail-hive-dev-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 431A21791F for ; Sun, 12 Apr 2015 23:27:45 +0000 (UTC)
Received: (qmail 22514 invoked by uid 500); 12 Apr 2015 23:27:44 -0000
Delivered-To: apmail-hive-dev-archive@hive.apache.org
Received: (qmail 22428 invoked by uid 500); 12 Apr 2015 23:27:44 -0000
Mailing-List: contact dev-help@hive.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@hive.apache.org
Delivered-To: mailing list dev@hive.apache.org
Received: (qmail 22412 invoked by uid 99); 12 Apr 2015 23:27:44 -0000
Received: from reviews-vm.apache.org (HELO reviews.apache.org) (140.211.11.40) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 12 Apr 2015 23:27:44 +0000
Received: from reviews.apache.org (localhost [127.0.0.1]) by reviews.apache.org (Postfix) with ESMTP id DC05A1CABFC; Sun, 12 Apr 2015 23:27:45 +0000 (UTC)
Content-Type: multipart/alternative; boundary="===============6624032358326551144=="
MIME-Version: 1.0
Subject: Re: Review Request 32549: HiveOnTez: Union followed by Multi-GB followed by Multi-insert loses data
From: "pengcheng xiong"
To: "Vikram Dixit Kumaraswamy", "Gunther Hagleitner"
Cc: "pengcheng xiong", "hive"
Date: Sun, 12 Apr 2015 23:27:45 -0000
Message-ID: <20150412232745.14115.75399@reviews.apache.org>
X-ReviewBoard-URL: https://reviews.apache.org/
Auto-Submitted: auto-generated
Sender: "pengcheng xiong"
X-ReviewGroup: hive
X-ReviewRequest-URL: https://reviews.apache.org/r/32549/
X-Sender: "pengcheng xiong"
References: <20150410071108.1488.84065@reviews.apache.org>
In-Reply-To: <20150410071108.1488.84065@reviews.apache.org>
Reply-To: "pengcheng xiong"
X-ReviewRequest-Repository: hive-git

--===============6624032358326551144==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32549/
-----------------------------------------------------------

(Updated April 12, 2015, 11:27 p.m.)


Review request for hive, Gunther Hagleitner and Vikram Dixit Kumaraswamy.


Repository: hive-git


Description
-------

In q.test environment with src table, execute the following query:
{code}
CREATE TABLE DEST1(key STRING, value STRING) STORED AS TEXTFILE;
CREATE TABLE DEST2(key STRING, val1 STRING, val2 STRING) STORED AS TEXTFILE;

FROM (select 'tst1' as key, cast(count(1) as string) as value from src s1
      UNION ALL
      select s2.key as key, s2.value as value from src s2) unionsrc
INSERT OVERWRITE TABLE DEST1
  SELECT unionsrc.key, COUNT(DISTINCT SUBSTR(unionsrc.value,5))
  GROUP BY unionsrc.key
INSERT OVERWRITE TABLE DEST2
  SELECT unionsrc.key, unionsrc.value, COUNT(DISTINCT SUBSTR(unionsrc.value,5))
  GROUP BY unionsrc.key, unionsrc.value;

select * from DEST1;
select * from DEST2;
{code}

DEST1 and DEST2 should both have 310 rows.
However, DEST2 has only 1 row: "tst1 500 1".


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/common/jsonexplain/tez/Vertex.java b45c782
  itests/src/test/resources/testconfiguration.properties 1502d80
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java 90616ad
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezUtils.java 4dcdf91
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezWork.java 0990894
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezWorkWalker.java 08fd61e
  ql/src/test/queries/clientpositive/explainuser_2.q 03264ca
  ql/src/test/queries/clientpositive/tez_union_multiinsert.q PRE-CREATION
  ql/src/test/results/clientpositive/tez/explainuser_2.q.out 61ebe1a
  ql/src/test/results/clientpositive/tez/tez_union_multiinsert.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/32549/diff/


Testing
-------


Thanks,

pengcheng xiong

--===============6624032358326551144==--