hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-12762) Common join on parquet tables returns incorrect result when hive.optimize.index.filter set to true
Date Tue, 05 Jan 2016 12:23:39 GMT

    [ https://issues.apache.org/jira/browse/HIVE-12762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082963#comment-15082963
] 

Hive QA commented on HIVE-12762:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12780181/HIVE-12762.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 121 failed/errored test(s), 9909 tests executed
*Failed tests:*
{noformat}
TestCliDriver-acid_vectorization.q-smb_mapjoin_2.q-null_column.q-and-12-more - did not produce
a TEST-*.xml file
TestCliDriver-authorization_1_sql_std.q-drop_index_removes_partition_dirs.q-udf_date_sub.q-and-12-more
- did not produce a TEST-*.xml file
TestCliDriver-auto_sortmerge_join_13.q-alter_partition_clusterby_sortby.q-rcfile_merge2.q-and-12-more
- did not produce a TEST-*.xml file
TestCliDriver-parquet_avro_array_of_primitives.q-udf_explode.q-udf4.q-and-12-more - did not
produce a TEST-*.xml file
TestHWISessionManager - did not produce a TEST-*.xml file
TestSparkCliDriver-vector_distinct_2.q-load_dyn_part2.q-udf_percentile.q-and-12-more - did
not produce a TEST-*.xml file
TestSparkCliDriver-vectorization_13.q-mapreduce2.q-auto_join22.q-and-12-more - did not produce
a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_partition
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization_project
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_concatenate_indexed_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_decimal_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_binarysortable_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket_map_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cast1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_unionDistinct_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_varchar_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_colstats_all_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_column_access_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_boolean
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_date_serde
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby7_noskew_multi_single_reducer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_identity_project_remove_skip
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_unused
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_bucket_sort_convert_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input22
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join18
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join32_lessSize
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nullsafe
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_reorder4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lateral_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_llap_acid
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_llap_uncompressed
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lock4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_test_outer
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_noalias_subq1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_llap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_timestamp2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_varchar1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_createas1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_merge3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rcfile_merge4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_mapwork_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_semicolon
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoin_mapjoin7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column_merge
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_min
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_null
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_24
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_top_level
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_acid3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_left_outer_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorization_3
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_ppd_basic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_join_part_col_char
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_truncate_column_buckets
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_create_merge_compressed
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_orc_ppd_basic
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_table
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_acid3
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_join_part_col_char
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_dyn_part
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_truncate_column_buckets
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_columnstats_partlvl_multiple_part_clause
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testCombinationInputFormat
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorization
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorizationWithAcid
org.apache.hadoop.hive.ql.io.orc.TestInputOutputFormat.testVectorizationWithBuckets
org.apache.hadoop.hive.ql.security.authorization.plugin.TestHiveOperationType.checkHiveOperationTypeMatch
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6514/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6514/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6514/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 121 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12780181 - PreCommit-HIVE-TRUNK-Build

> Common join on parquet tables returns incorrect result when hive.optimize.index.filter
set to true
> --------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-12762
>                 URL: https://issues.apache.org/jira/browse/HIVE-12762
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer
>    Affects Versions: 2.1.0
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>         Attachments: HIVE-12762.patch
>
>
> The following query will give incorrect result.
> {noformat}
> CREATE TABLE tbl1(id INT) STORED AS PARQUET;
> INSERT INTO tbl1 VALUES(1), (2);
> CREATE TABLE tbl2(id INT, value STRING) STORED AS PARQUET;
> INSERT INTO tbl2 VALUES(1, 'value1');
> INSERT INTO tbl2 VALUES(1, 'value2');
> set hive.optimize.index.filter = true;
> set hive.auto.convert.join=false;
> select tbl1.id, t1.value, t2.value
> FROM tbl1
> JOIN (SELECT * FROM tbl2 WHERE value='value1') t1 ON tbl1.id=t1.id
> JOIN (SELECT * FROM tbl2 WHERE value='value2') t2 ON tbl1.id=t2.id;
> {noformat}
> We are enforcing to use common join and tbl2 will have 2 files after 2 insertions underneath.
> the map job contains 3 TableScan operators (2 for tbl2 and 1 for tbl1). When    hive.optimize.index.filter
is set to true, we are incorrectly applying the later filtering condition to each block, which
causes no data is returned for the subquery {{SELECT * FROM tbl2 WHERE value='value1'}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message