hive-issues mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-16811) Estimate statistics in absence of stats
Date Thu, 03 Aug 2017 05:56:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-16811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112228#comment-16112228 ]

Hive QA commented on HIVE-16811:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12880138/HIVE-16811.4.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 80 failed/errored test(s), 11139 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_filter] (batchId=8)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_groupby] (batchId=47)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_select] (batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[annotate_stats_table] (batchId=20)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_annotate_stats_groupby] (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[columnStatsUpdateForStatsOptimizer_2] (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_uncompressed] (batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_47] (batchId=28)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[udaf_collect_set_2] (batchId=158)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning_mapjoin_only] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1] (batchId=169)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_use_op_stats] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_vectorized_dynamic_partition_pruning] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=99)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[hybridgrace_hashjoin_1] (batchId=99)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[multi_count_distinct] (batchId=99)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[tez-tag] (batchId=99)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query11] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query15] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query17] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query18] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query19] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query21] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query24] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query25] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query29] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query30] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query31] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query32] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query34] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query35] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query37] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query40] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query44] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query45] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query46] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query47] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query48] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query4] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query50] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query53] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query54] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query57] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query58] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query61] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query63] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query64] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query65] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query67] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query68] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query6] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query72] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query73] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query74] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query75] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query76] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query77] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query78] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query79] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query80] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query81] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query82] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query83] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query85] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query88] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query89] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query8] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query90] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query91] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query92] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query95] (batchId=236)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query99] (batchId=236)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=179)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=179)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6241/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6241/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6241/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 80 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12880138 - PreCommit-HIVE-Build

> Estimate statistics in absence of stats
> ---------------------------------------
>
>                 Key: HIVE-16811
>                 URL: https://issues.apache.org/jira/browse/HIVE-16811
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>         Attachments: HIVE-16811.1.patch, HIVE-16811.2.patch, HIVE-16811.3.patch, HIVE-16811.4.patch
>
>
> Currently, join ordering bails out completely in the absence of statistics, which can lead to bad joins such as cross joins.
> For example, the following select query will produce a cross join.
> {code:sql}
> create table supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING, S_NATIONKEY INT,
> S_PHONE STRING, S_ACCTBAL DOUBLE, S_COMMENT STRING);
> CREATE TABLE lineitem (L_ORDERKEY      INT,
>                                 L_PARTKEY       INT,
>                                 L_SUPPKEY       INT,
>                                 L_LINENUMBER    INT,
>                                 L_QUANTITY      DOUBLE,
>                                 L_EXTENDEDPRICE DOUBLE,
>                                 L_DISCOUNT      DOUBLE,
>                                 L_TAX           DOUBLE,
>                                 L_RETURNFLAG    STRING,
>                                 L_LINESTATUS    STRING,
>                                 l_shipdate      STRING,
>                                 L_COMMITDATE    STRING,
>                                 L_RECEIPTDATE   STRING,
>                                 L_SHIPINSTRUCT  STRING,
>                                 L_SHIPMODE      STRING,
>                                 L_COMMENT       STRING) partitioned by (dl int)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|';
> CREATE TABLE part(
>     p_partkey INT,
>     p_name STRING,
>     p_mfgr STRING,
>     p_brand STRING,
>     p_type STRING,
>     p_size INT,
>     p_container STRING,
>     p_retailprice DOUBLE,
>     p_comment STRING
> );
> explain select count(1) from part,supplier,lineitem where p_partkey = l_partkey and s_suppkey = l_suppkey;
> {code}
> Estimating stats will keep the join ordering algorithm from bailing out, so it can come up with a join order that is at least better than a cross join.
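
As a rough illustration of what "estimating stats" could mean for the tables above, here is a minimal sketch that guesses a row count from a table's on-disk size and its column types. It is not taken from the HIVE-16811 patches. The function name estimate_row_count, the per-type widths in ASSUMED_TYPE_WIDTH, and the 1 MiB table size are all hypothetical values chosen for the example.

```python
# Hypothetical average widths in bytes per column type. These are
# illustrative assumptions, not values used by Hive or by the patch.
ASSUMED_TYPE_WIDTH = {"INT": 4, "DOUBLE": 8, "STRING": 100}


def estimate_row_count(column_types, total_data_bytes):
    """Estimate rows as on-disk size divided by an assumed average row width."""
    row_width = sum(ASSUMED_TYPE_WIDTH[t.upper()] for t in column_types)
    # Report at least one row so a cost model always has a non-zero input.
    return max(1, total_data_bytes // row_width)


# Column types of the `part` table from the issue description, in order.
part_columns = ["INT", "STRING", "STRING", "STRING", "STRING",
                "INT", "STRING", "DOUBLE", "STRING"]

# Assume a hypothetical on-disk size of 1 MiB for `part`.
part_rows = estimate_row_count(part_columns, 1024 * 1024)
print(part_rows)
```

Even a crude figure like this gives a cost-based optimizer something to compare between join inputs, which is the situation the issue description asks for.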



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
