Return-Path:
X-Original-To: apmail-hive-dev-archive@www.apache.org
Delivered-To: apmail-hive-dev-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 2F8DC11B47 for ; Tue, 23 Sep 2014 17:32:36 +0000 (UTC)
Received: (qmail 12443 invoked by uid 500); 23 Sep 2014 17:32:35 -0000
Delivered-To: apmail-hive-dev-archive@hive.apache.org
Received: (qmail 12331 invoked by uid 500); 23 Sep 2014 17:32:34 -0000
Mailing-List: contact dev-help@hive.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@hive.apache.org
Delivered-To: mailing list dev@hive.apache.org
Received: (qmail 12312 invoked by uid 500); 23 Sep 2014 17:32:34 -0000
Delivered-To: apmail-hadoop-hive-dev@hadoop.apache.org
Received: (qmail 12307 invoked by uid 99); 23 Sep 2014 17:32:34 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 23 Sep 2014 17:32:34 +0000
Date: Tue, 23 Sep 2014 17:32:34 +0000 (UTC)
From: "Chao (JIRA)"
To: hive-dev@hadoop.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Updated] (HIVE-8207) Add .q tests for multi-table insertion [Spark Branch]
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

     [ https://issues.apache.org/jira/browse/HIVE-8207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chao updated HIVE-8207:
-----------------------
    Description:
Now that multi-table insertion is committed to branch, we should enable those related qtests.
Here is a list of qfiles that should be activated (some of them may already be activated).
The list may not be comprehensive.
{noformat}
add_part_multiple.q
auto_smb_mapjoin_14.q
bucket5.q
column_access_stats.q
date_udf.q
groupby10.q
groupby11.q
groupby3_map_multi_distinct.q
groupby3_map.q
groupby3_map_skew.q
groupby3_noskew_multi_distinct.q
groupby3_noskew.q
groupby7_map_multi_single_reducer.q
groupby7_map.q
groupby7_map_skew.q
groupby7_noskew_multi_single_reducer.q
groupby7_noskew.q
groupby7.q
groupby8_map.q
groupby8_map_skew.q
groupby8_noskew.q
groupby8.q
groupby9.q
groupby_complex_types_multi_single_reducer.q
groupby_complex_types.q
groupby_cube1.q
groupby_map_ppr_multi_distinct.q
groupby_map_ppr.q
groupby_multi_insert_common_distinct.q
groupby_multi_single_reducer2.q
groupby_multi_single_reducer3.q
groupby_multi_single_reducer.q
groupby_position.q
groupby_ppr.q
groupby_rollup1.q
groupby_sort_1_23.q
groupby_sort_1.q
groupby_sort_skew_1_23.q
infer_bucket_sort_multi_insert.q
innerjoin.q
input12_hadoop20.q
input12.q
input13.q
input14.q
input17.q
input18.q
input1_limit.q
input_part2.q
insert_into3.q
join_nullsafe.q
load_dyn_part8.q
metadata_only_queries_with_filters.q
multigroupby_singlemr.q
multi_insert_gby2.q
multi_insert_gby3.q
multi_insert_gby.q
multi_insert_lateral_view.q
multi_insert_move_tasks_share_dependencies.q
multi_insert.q
parallel.q
partition_date2.q
pcr.q
ppd_multi_insert.q
ppd_transform.q
smb_mapjoin_11.q
smb_mapjoin_12.q
smb_mapjoin_13.q
smb_mapjoin_15.q
smb_mapjoin_16.q
stats4.q
subquery_multiinsert.q
table_access_keys_stats.q
tez_dml.q
udaf_percentile_approx_20.q
udaf_percentile_approx_23.q
union17.q
union18.q
union19.q
{noformat}

There are some tests that cannot be enabled right now, for various reasons:

# ForwardOperator issue, including:
{noformat}
groupby7_noskew_multi_single_reducer.q
groupby8_map.q
groupby8_map_skew.q
groupby8_noskew.q
groupby8.q
groupby9.q
groupby10.q
groupby_complex_types_multi_single_reducer.q
groupby_multi_insert_common_distinct.q
union17.q
{noformat}
*Reason*: currently, if the node to break in the operator tree is a ForwardOperator, we simply do nothing. However, we may have the following case:
{noformat}
...... FOR -> RS_0 -> RS_1
                  \-> RS_2
{noformat}
Here, {{RS_0}} leads to both {{RS_1}} and {{RS_2}}, and because of the issues in HIVE-7731 and HIVE-8118, both downstream branches will get duplicated results.

# Stats issue, including:
{noformat}
bucket5.q
infer_bucket_sort_multi_insert.q
stats4.q
smb_mapjoin_13.q
smb_mapjoin_15.q
{noformat}
In these tests, I get diff errors because {{numRows}} and {{rawDataSize}} are -1, but they are expected to be positive values. I don't think this is related to multi-insertion.

# Join/SMB Join issue, including:
{noformat}
auto_smb_mapjoin_14.q
auto_sortmerge_join_13.q
smb_mapjoin_11.q
smb_mapjoin_12.q
smb_mapjoin_13.q
smb_mapjoin_15.q
smb_mapjoin_16.q
{noformat}
These tests failed either with an exception or with a diff. I think it's because SMB Join (HIVE-8202) isn't supported right now.

# Result doesn't match, including:
{noformat}
groupby3_map_skew.q
groupby_map_ppr_multi_distinct.q
groupby_map_ppr.q
partition_date2.q
udaf_percentile_approx_23.q
{noformat}
The results from these tests are different from MR's, but I don't think they are related to multi-insertion.

  was:
Now that multi-table insertion is committed to branch, we should enable those related qtests.
Here is a list of qfiles that should be activated (some of them may already be activated).
The list may not be comprehensive.
{noformat}
add_part_multiple.q
auto_smb_mapjoin_14.q
bucket5.q
column_access_stats.q
date_udf.q
groupby10.q
groupby11.q
groupby3_map_multi_distinct.q
groupby3_map.q
groupby3_map_skew.q
groupby3_noskew_multi_distinct.q
groupby3_noskew.q
groupby7_map_multi_single_reducer.q
groupby7_map.q
groupby7_map_skew.q
groupby7_noskew_multi_single_reducer.q
groupby7_noskew.q
groupby7.q
groupby8_map.q
groupby8_map_skew.q
groupby8_noskew.q
groupby8.q
groupby9.q
groupby_complex_types_multi_single_reducer.q
groupby_complex_types.q
groupby_cube1.q
groupby_map_ppr_multi_distinct.q
groupby_map_ppr.q
groupby_multi_insert_common_distinct.q
groupby_multi_single_reducer2.q
groupby_multi_single_reducer3.q
groupby_multi_single_reducer.q
groupby_position.q
groupby_ppr.q
groupby_rollup1.q
groupby_sort_1_23.q
groupby_sort_1.q
groupby_sort_skew_1_23.q
infer_bucket_sort_multi_insert.q
innerjoin.q
input12_hadoop20.q
input12.q
input13.q
input14.q
input17.q
input18.q
input1_limit.q
input_part2.q
insert_into3.q
join_nullsafe.q
load_dyn_part8.q
metadata_only_queries_with_filters.q
multigroupby_singlemr.q
multi_insert_gby2.q
multi_insert_gby3.q
multi_insert_gby.q
multi_insert_lateral_view.q
multi_insert_move_tasks_share_dependencies.q
multi_insert.q
parallel.q
partition_date2.q
pcr.q
ppd_multi_insert.q
ppd_transform.q
smb_mapjoin_11.q
smb_mapjoin_12.q
smb_mapjoin_13.q
smb_mapjoin_15.q
smb_mapjoin_16.q
stats4.q
subquery_multiinsert.q
table_access_keys_stats.q
tez_dml.q
udaf_percentile_approx_20.q
udaf_percentile_approx_23.q
union17.q
union18.q
union19.q
{noformat}


> Add .q tests for multi-table insertion [Spark Branch]
> -----------------------------------------------------
>
>                 Key: HIVE-8207
>                 URL: https://issues.apache.org/jira/browse/HIVE-8207
>             Project: Hive
>          Issue Type: Test
>          Components: Spark
>            Reporter: Chao
>            Assignee: Chao
>         Attachments: HIVE-8207.1-spark.patch
>
>
> Now that multi-table insertion is committed to branch, we should enable those related qtests.
> Here is a list of qfiles that should be activated (some of them may already be activated).
> The list may not be comprehensive.
> {noformat}
> add_part_multiple.q
> auto_smb_mapjoin_14.q
> bucket5.q
> column_access_stats.q
> date_udf.q
> groupby10.q
> groupby11.q
> groupby3_map_multi_distinct.q
> groupby3_map.q
> groupby3_map_skew.q
> groupby3_noskew_multi_distinct.q
> groupby3_noskew.q
> groupby7_map_multi_single_reducer.q
> groupby7_map.q
> groupby7_map_skew.q
> groupby7_noskew_multi_single_reducer.q
> groupby7_noskew.q
> groupby7.q
> groupby8_map.q
> groupby8_map_skew.q
> groupby8_noskew.q
> groupby8.q
> groupby9.q
> groupby_complex_types_multi_single_reducer.q
> groupby_complex_types.q
> groupby_cube1.q
> groupby_map_ppr_multi_distinct.q
> groupby_map_ppr.q
> groupby_multi_insert_common_distinct.q
> groupby_multi_single_reducer2.q
> groupby_multi_single_reducer3.q
> groupby_multi_single_reducer.q
> groupby_position.q
> groupby_ppr.q
> groupby_rollup1.q
> groupby_sort_1_23.q
> groupby_sort_1.q
> groupby_sort_skew_1_23.q
> infer_bucket_sort_multi_insert.q
> innerjoin.q
> input12_hadoop20.q
> input12.q
> input13.q
> input14.q
> input17.q
> input18.q
> input1_limit.q
> input_part2.q
> insert_into3.q
> join_nullsafe.q
> load_dyn_part8.q
> metadata_only_queries_with_filters.q
> multigroupby_singlemr.q
> multi_insert_gby2.q
> multi_insert_gby3.q
> multi_insert_gby.q
> multi_insert_lateral_view.q
> multi_insert_move_tasks_share_dependencies.q
> multi_insert.q
> parallel.q
> partition_date2.q
> pcr.q
> ppd_multi_insert.q
> ppd_transform.q
> smb_mapjoin_11.q
> smb_mapjoin_12.q
> smb_mapjoin_13.q
> smb_mapjoin_15.q
> smb_mapjoin_16.q
> stats4.q
> subquery_multiinsert.q
> table_access_keys_stats.q
> tez_dml.q
> udaf_percentile_approx_20.q
> udaf_percentile_approx_23.q
> union17.q
> union18.q
> union19.q
> {noformat}
> There are some tests that cannot be enabled right now, for various reasons:
> # ForwardOperator issue, including:
> {noformat}
> groupby7_noskew_multi_single_reducer.q
> groupby8_map.q
> groupby8_map_skew.q
> groupby8_noskew.q
> groupby8.q
> groupby9.q
> groupby10.q
> groupby_complex_types_multi_single_reducer.q
> groupby_multi_insert_common_distinct.q
> union17.q
> {noformat}
> *Reason*: currently, if the node to break in the operator tree is a ForwardOperator, we simply do nothing. However, we may have the following case:
> {noformat}
> ...... FOR -> RS_0 -> RS_1
>                   \-> RS_2
> {noformat}
> Here, {{RS_0}} leads to both {{RS_1}} and {{RS_2}}, and because of the issues in HIVE-7731 and HIVE-8118, both downstream branches will get duplicated results.
> # Stats issue, including:
> {noformat}
> bucket5.q
> infer_bucket_sort_multi_insert.q
> stats4.q
> smb_mapjoin_13.q
> smb_mapjoin_15.q
> {noformat}
> In these tests, I get diff errors because {{numRows}} and {{rawDataSize}} are -1, but they are expected to be positive values. I don't think this is related to multi-insertion.
> # Join/SMB Join issue, including:
> {noformat}
> auto_smb_mapjoin_14.q
> auto_sortmerge_join_13.q
> smb_mapjoin_11.q
> smb_mapjoin_12.q
> smb_mapjoin_13.q
> smb_mapjoin_15.q
> smb_mapjoin_16.q
> {noformat}
> These tests failed either with an exception or with a diff. I think it's because SMB Join (HIVE-8202) isn't supported right now.
> # Result doesn't match, including:
> {noformat}
> groupby3_map_skew.q
> groupby_map_ppr_multi_distinct.q
> groupby_map_ppr.q
> partition_date2.q
> udaf_percentile_approx_23.q
> {noformat}
> The results from these tests are different from MR's, but I don't think they are related to multi-insertion.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)