hive-dev mailing list archives

From "Xuefu Zhang" <xzh...@cloudera.com>
Subject Re: Review Request 34576: Bucketized Table feature fails in some cases
Date Sun, 24 May 2015 01:50:10 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/#review85081
-----------------------------------------------------------


Could you also link the JIRA number in the review request?


ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
<https://reviews.apache.org/r/34576/#comment136557>

    nit: remove tab/space



ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java
<https://reviews.apache.org/r/34576/#comment136558>

    The warning is appropriate, but I think the wording should say "might", because the source data might
already be bucketed in a way that matches the target, in which case there is no problem.
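
To illustrate the "might" case, here is a minimal sketch that is not taken from the patch. It assumes a bucket is chosen as a non-negative hash of the key modulo the bucket count, and it uses String.hashCode() as a stand-in for Hive's own hash function. The class and method names (BucketLayoutCheck, bucketFor, matchesBucketing) are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative only: not Hive's actual bucketing code. */
final class BucketLayoutCheck {

    /** Maps a key to a bucket index. String.hashCode() stands in for Hive's hash (assumption). */
    static int bucketFor(String key, int numBuckets) {
        return (key.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }

    /**
     * Returns true if there is one file per bucket and every key in
     * files.get(i) maps to bucket i, i.e. the loaded files already
     * match the target table's bucketing.
     */
    static boolean matchesBucketing(List<List<String>> files, int numBuckets) {
        if (files.size() != numBuckets) {
            return false;
        }
        for (int i = 0; i < files.size(); i++) {
            for (String key : files.get(i)) {
                if (bucketFor(key, numBuckets) != i) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Source files that happen to be laid out like the target's 2 buckets.
        List<List<String>> bucketed =
            Arrays.asList(Arrays.asList("b", "d"), Arrays.asList("a", "c"));
        // A single loaded file holding every key, as after a plain LOAD DATA.
        List<List<String>> singleFile =
            Collections.singletonList(Arrays.asList("a", "b", "c", "d"));

        System.out.println(matchesBucketing(bucketed, 2));   // prints true
        System.out.println(matchesBucketing(singleFile, 2)); // prints false
    }
}
```

Only the second layout is a problem, which is why the warning text probably should not state the mismatch as a certainty.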


- Xuefu Zhang


On May 23, 2015, 5:47 p.m., pengcheng xiong wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34576/
> -----------------------------------------------------------
> 
> (Updated May 23, 2015, 5:47 p.m.)
> 
> 
> Review request for hive and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The Bucketized Table feature fails in some cases. If the source and destination tables are
> bucketed on the same key, but the actual data in the source is not bucketed (because it was
> loaded using LOAD DATA LOCAL INPATH), then the data won't be bucketed when it is written to
> the destination.
> Example
> ----------------------------------------------------------------------
> CREATE TABLE P1(key STRING, val STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE P1;
> -- perform an insert to make sure there are 2 files
> INSERT OVERWRITE TABLE P1 select key, val from P1;
> --------------------------------------------------
> This is not a regression; it has never worked.
> It was only discovered because of Hadoop2 changes.
> In Hadoop1, in local mode, the number of reducers is always 1, regardless of what the
> application requests. Hadoop2 now honors the number-of-reducers setting in local mode (by
> spawning threads).
> The long-term solution seems to be to prevent LOAD DATA for bucketed tables.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java e53933e 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 1a9b42b 
>   ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 623c2e8 
>   ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_1.q.out f4522d2
>   ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_2.q.out 9aa9b5d
>   ql/src/test/results/clientnegative/exim_11_nonpart_noncompat_sorting.q.out 9220c8e
>   ql/src/test/results/clientpositive/auto_join32.q.out bfc8be8 
>   ql/src/test/results/clientpositive/auto_join_filters.q.out a6720d9 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 383defd 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out e6e7ef3 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out e9fb705 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out c089419 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out 6e443fa 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out feaea04 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out f64ecf0 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e89f548 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 44c037f 
>   ql/src/test/results/clientpositive/bucket_map_join_1.q.out d778203 
>   ql/src/test/results/clientpositive/bucket_map_join_2.q.out aef77aa 
>   ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 870ecdd 
>   ql/src/test/results/clientpositive/bucket_map_join_spark2.q.out 33f5c46 
>   ql/src/test/results/clientpositive/bucket_map_join_spark3.q.out 067d1ff 
>   ql/src/test/results/clientpositive/bucketcontext_1.q.out 77bfcf9 
>   ql/src/test/results/clientpositive/bucketcontext_2.q.out a9db13d 
>   ql/src/test/results/clientpositive/bucketcontext_3.q.out 9ba3e0c 
>   ql/src/test/results/clientpositive/bucketcontext_4.q.out a2b37a8 
>   ql/src/test/results/clientpositive/bucketcontext_5.q.out 3ee1f0e 
>   ql/src/test/results/clientpositive/bucketcontext_6.q.out d2304fa 
>   ql/src/test/results/clientpositive/bucketcontext_7.q.out 1a105ed 
>   ql/src/test/results/clientpositive/bucketcontext_8.q.out 138e415 
>   ql/src/test/results/clientpositive/bucketizedhiveinputformat_auto.q.out 215efdd 
>   ql/src/test/results/clientpositive/bucketmapjoin1.q.out 72f2a07 
>   ql/src/test/results/clientpositive/bucketmapjoin10.q.out b0e849d 
>   ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4263cab 
>   ql/src/test/results/clientpositive/bucketmapjoin12.q.out bcd7394 
>   ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d 
>   ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 
>   ql/src/test/results/clientpositive/bucketmapjoin4.q.out f61500c 
>   ql/src/test/results/clientpositive/bucketmapjoin5.q.out 0cb2825 
>   ql/src/test/results/clientpositive/bucketmapjoin7.q.out 667a9db 
>   ql/src/test/results/clientpositive/bucketmapjoin8.q.out 252b377 
>   ql/src/test/results/clientpositive/bucketmapjoin9.q.out 5e28dc3 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 6ae127d 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4c9f54a 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out 9a0bfc4 
>   ql/src/test/results/clientpositive/groupby_sort_1_23.q.out 34cd1ff 
>   ql/src/test/results/clientpositive/groupby_sort_2.q.out b5e52f1 
>   ql/src/test/results/clientpositive/groupby_sort_3.q.out c16911a 
>   ql/src/test/results/clientpositive/groupby_sort_4.q.out a6b1c3d 
>   ql/src/test/results/clientpositive/groupby_sort_5.q.out 369e2b5 
>   ql/src/test/results/clientpositive/groupby_sort_7.q.out 7264695 
>   ql/src/test/results/clientpositive/groupby_sort_8.q.out ec16eb0 
>   ql/src/test/results/clientpositive/groupby_sort_9.q.out e49781a 
>   ql/src/test/results/clientpositive/groupby_sort_skew_1_23.q.out 0d631ce 
>   ql/src/test/results/clientpositive/groupby_sort_test_1.q.out 8c1765d 
>   ql/src/test/results/clientpositive/insert_orig_table.q.out 5eea74d 
>   ql/src/test/results/clientpositive/insert_values_orig_table.q.out 684cd1b 
>   ql/src/test/results/clientpositive/join_filters.q.out 4f112bd 
>   ql/src/test/results/clientpositive/join_nulls.q.out 46e0170 
>   ql/src/test/results/clientpositive/mergejoin.q.out cb96ab3 
>   ql/src/test/results/clientpositive/skewjoin_mapjoin11.q.out dd084e8 
>   ql/src/test/results/clientpositive/skewjoinopt19.q.out fd43409 
>   ql/src/test/results/clientpositive/skewjoinopt20.q.out a28e433 
>   ql/src/test/results/clientpositive/smb_mapjoin_1.q.out 9ab334b 
>   ql/src/test/results/clientpositive/smb_mapjoin_10.q.out ea2fa51 
>   ql/src/test/results/clientpositive/smb_mapjoin_2.q.out 379dc0d 
>   ql/src/test/results/clientpositive/smb_mapjoin_25.q.out c0a8959 
>   ql/src/test/results/clientpositive/smb_mapjoin_3.q.out 26fa5d4 
>   ql/src/test/results/clientpositive/smb_mapjoin_4.q.out 9fc7f93 
>   ql/src/test/results/clientpositive/smb_mapjoin_5.q.out 6e6882a 
>   ql/src/test/results/clientpositive/smb_mapjoin_7.q.out 82f5804 
>   ql/src/test/results/clientpositive/spark/auto_join32.q.out 361a968 
>   ql/src/test/results/clientpositive/spark/auto_join_filters.q.out 8934433 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_1.q.out 09d2692 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_12.q.out 8102ec1 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_2.q.out 2ea0a65 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_3.q.out 6281929 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_4.q.out 31e9d86 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_5.q.out 3eceb0b 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_7.q.out ddbca05 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_8.q.out 88d4dcb 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_1.q.out 4e8ce0d 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_2.q.out c0a3c3d 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark1.q.out 6230bef 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark2.q.out 1a33625 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark3.q.out fed923c 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out 65bded2 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_tez2.q.out 33e6d63 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out 44f4d0c 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin10.q.out 678ad54 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin11.q.out 95606f0 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin12.q.out d6c25e4 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin2.q.out d82480e 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin3.q.out 39552c1 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin4.q.out ad2762d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin5.q.out f7c3d4d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin7.q.out 7bfe440 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin8.q.out 4601eb1 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin9.q.out 60bd103 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative.q.out 031c46c 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative2.q.out 4a8f46d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out a09904e 
>   ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out cfbce61 
>   ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 9343805 
>   ql/src/test/results/clientpositive/spark/skewjoinopt19.q.out eb9bb84 
>   ql/src/test/results/clientpositive/spark/skewjoinopt20.q.out 22de156 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_1.q.out 1ff1262 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_10.q.out cadf08e 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_2.q.out a0d51f3 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_25.q.out cb811ed 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_3.q.out f46b833 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_4.q.out a421a42 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_5.q.out af65010 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out 622b950 
>   ql/src/test/results/clientpositive/stats11.q.out e51f049 
>   ql/src/test/results/clientpositive/tez/auto_join_filters.q.out 8fde41d 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_1.q.out a275d27 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_11.q.out 6ac74ca 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_12.q.out 8c8a3bf 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_2.q.out 2cb8416 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_3.q.out abeceb8 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_4.q.out 8eb9ce5 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_5.q.out adcc1fa 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_7.q.out 2562cb0 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_8.q.out 31b0a97 
>   ql/src/test/results/clientpositive/tez/bucket_map_join_tez1.q.out 61c197f 
>   ql/src/test/results/clientpositive/tez/bucket_map_join_tez2.q.out 3f980b6 
>   ql/src/test/results/clientpositive/tez/explainuser_2.q.out f84524b 
>   ql/src/test/results/clientpositive/tez/insert_orig_table.q.out 5eea74d 
>   ql/src/test/results/clientpositive/tez/mergejoin.q.out 97df12a 
>   ql/src/test/results/clientpositive/tez/tez_fsstat.q.out 3fcf68c 
>   ql/src/test/results/clientpositive/tez/tez_smb_1.q.out d970bd9 
>   ql/src/test/results/clientpositive/tez/tez_smb_main.q.out 6183390 
>   ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out 14a6874 
> 
> Diff: https://reviews.apache.org/r/34576/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>

