Date: Thu, 11 Jun 2015 05:20:02 +0000 (UTC)
From: "Illya Yalovyy (JIRA)"
To: dev@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Created] (HIVE-10980) Merge of dynamic partitions loads all data to default partition

Illya Yalovyy created HIVE-10980:
------------------------------------

             Summary: Merge of dynamic partitions loads all data to default partition
                 Key: HIVE-10980
                 URL: https://issues.apache.org/jira/browse/HIVE-10980
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 0.14.0
         Environment: HDP 2.2.4 (also reproduced on apache hive built from trunk)
            Reporter: Illya Yalovyy


Conditions that lead to the issue:
1. Partition columns have different types
2. Both static and dynamic partitions are used in the query
3. Dynamically generated partitions require merge

Result: Final data is loaded to "__HIVE_DEFAULT_PARTITION__".

Steps to reproduce:

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=strict;
set hive.optimize.sort.dynamic.partition=false;
set hive.merge.mapfiles=true;
set hive.merge.mapredfiles=true;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

create external table sdp (
  dataint bigint,
  hour int,
  req string,
  cid string,
  caid string
)
row format delimited
fields terminated by ',';

load data local inpath '../../data/files/dynpartdata1.txt' into table sdp;
load data local inpath '../../data/files/dynpartdata2.txt' into table sdp;
...
load data local inpath '../../data/files/dynpartdataN.txt' into table sdp;

create table tdp (cid string, caid string)
partitioned by (dataint bigint, hour int, req string);

insert overwrite table tdp partition (dataint=20150316, hour=16, req)
select cid, caid, req from sdp where dataint=20150316 and hour=16;

select * from tdp order by caid;
show partitions tdp;

Example of the input file:

20150316,16,reqA,clusterIdA,cacheId1
20150316,16,reqB,clusterIdB,cacheId2
20150316,16,reqA,clusterIdC,cacheId3
20150316,16,reqD,clusterIdD,cacheId4
20150316,16,reqA,clusterIdA,cacheId5

Actual result:

clusterIdA  cacheId1  20150316  16  __HIVE_DEFAULT_PARTITION__
clusterIdA  cacheId1  20150316  16  __HIVE_DEFAULT_PARTITION__
clusterIdB  cacheId2  20150316  16  __HIVE_DEFAULT_PARTITION__
clusterIdC  cacheId3  20150316  16  __HIVE_DEFAULT_PARTITION__
clusterIdD  cacheId4  20150316  16  __HIVE_DEFAULT_PARTITION__
clusterIdA  cacheId5  20150316  16  __HIVE_DEFAULT_PARTITION__
clusterIdD  cacheId8  20150316  16  __HIVE_DEFAULT_PARTITION__
clusterIdB  cacheId9  20150316  16  __HIVE_DEFAULT_PARTITION__

dataint=20150316/hour=16/req=__HIVE_DEFAULT_PARTITION__



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)