hive-dev mailing list archives

From "Illya Yalovyy" <yalov...@amazon.com>
Subject Review Request 38268: HIVE-10980 Merge of dynamic partitions loads all data to default partition
Date Thu, 10 Sep 2015 20:46:03 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38268/
-----------------------------------------------------------

Review request for hive and Gopal V.


Bugs: HIVE-10980
    https://issues.apache.org/jira/browse/HIVE-10980


Repository: hive-git


Description
-------

https://issues.apache.org/jira/browse/HIVE-10980

Conditions that lead to the issue:
1. Execution engine set to MapReduce
2. Partition columns have different types
3. Both static and dynamic partitions are used in the query
4. Dynamically generated partitions require merge

Result: All final data is loaded into the "__HIVE_DEFAULT_PARTITION__" partition instead of the partitions named by the dynamic partition column.

Steps to reproduce:
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=strict;
set hive.optimize.sort.dynamic.partition=false;
set hive.merge.mapfiles=true;
set hive.merge.mapredfiles=true;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set hive.execution.engine=mr;

create external table sdp (
  dataint bigint,
  hour int,
  req string,
  cid string,
  caid string
)
row format delimited
fields terminated by ',';

load data local inpath '../../data/files/dynpartdata1.txt' into table sdp;
load data local inpath '../../data/files/dynpartdata2.txt' into table sdp;
...
load data local inpath '../../data/files/dynpartdataN.txt' into table sdp;

create table tdp (cid string, caid string)
partitioned by (dataint bigint, hour int, req string);

insert overwrite table tdp partition (dataint=20150316, hour=16, req)
select cid, caid, req from sdp where dataint=20150316 and hour=16;

select * from tdp order by caid;
show partitions tdp;

Example of the input file:
20150316,16,reqA,clusterIdA,cacheId1            
20150316,16,reqB,clusterIdB,cacheId2         
20150316,16,reqA,clusterIdC,cacheId3          
20150316,16,reqD,clusterIdD,cacheId4        
20150316,16,reqA,clusterIdA,cacheId5      

Actual result:
clusterIdA      cacheId1        20150316        16      __HIVE_DEFAULT_PARTITION__ 
clusterIdA      cacheId1        20150316        16      __HIVE_DEFAULT_PARTITION__
clusterIdB      cacheId2        20150316        16      __HIVE_DEFAULT_PARTITION__
clusterIdC      cacheId3        20150316        16      __HIVE_DEFAULT_PARTITION__
clusterIdD      cacheId4        20150316        16      __HIVE_DEFAULT_PARTITION__
clusterIdA      cacheId5        20150316        16      __HIVE_DEFAULT_PARTITION__
clusterIdD      cacheId8        20150316        16      __HIVE_DEFAULT_PARTITION__
clusterIdB      cacheId9        20150316        16      __HIVE_DEFAULT_PARTITION__

dataint=20150316/hour=16/req=__HIVE_DEFAULT_PARTITION__
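
A quick way to confirm the symptom after running the steps above is to count rows per value of the dynamic partition column. This is only a sketch: it reuses the tdp table and the static partition values from the reproduction, and it is not part of the patch under review. If the problem is present, every row would be expected to appear under __HIVE_DEFAULT_PARTITION__ rather than under its own req value.

```sql
-- Count rows per value of the dynamic partition column (req)
-- within the statically specified partition (dataint, hour).
select req, count(*) as row_count
from tdp
where dataint = 20150316 and hour = 16
group by req;
```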


Diffs
-----

  data/files/dynpartdata1.txt PRE-CREATION 
  data/files/dynpartdata2.txt PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 4a325fb 
  ql/src/test/org/apache/hadoop/hive/ql/optimizer/TestGenMapRedUtilsUsePartitionColumnsNegative.java PRE-CREATION 
  ql/src/test/org/apache/hadoop/hive/ql/optimizer/TestGenMapRedUtilsUsePartitionColumnsPositive.java PRE-CREATION 
  ql/src/test/queries/clientpositive/dynpart_merge.q PRE-CREATION 
  ql/src/test/results/clientpositive/dynpart_merge.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/list_bucket_dml_6.q.java1.7.out d223234 
  ql/src/test/results/clientpositive/list_bucket_dml_6.q.java1.8.out f884ace 
  ql/src/test/results/clientpositive/list_bucket_dml_7.q.out 541944d 

Diff: https://reviews.apache.org/r/38268/diff/


Testing
-------

1. Added new unit tests
2. Added qtest
3. Updated old qtests


Thanks,

Illya Yalovyy

