hive-dev mailing list archives

From Zoltan Haindrich <k...@rxd.hu>
Subject Re: Review Request 63442: HIVE-17934 Merging Statistics are promoted to COMPLETE (most of the time)
Date Thu, 09 Nov 2017 17:41:26 GMT


> On Nov. 3, 2017, 11:54 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
> > Line 1656 (original), 1656 (patched)
> > <https://reviews.apache.org/r/63442/diff/1/?file=1873420#file1873420line1657>
> >
> >     why are we setting state to partial here? For operators other than TableScan
> >     we derive stats and we keep state as is.

This turned out to be a really bad idea, and it caused a lot of regressions.
The new patch is much more conservative: it tries to degrade the stats state only when
that is necessary.
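
To illustrate what "degrade only if necessary" could mean when two stats objects are merged, here is a minimal sketch. It is not the code from the patch: the class name StatsStateSketch, the three-level State enum, and the merge helper are all assumptions made for illustration. The idea shown is that a merge keeps the weaker of the two input states, so the result is never promoted to COMPLETE unless both inputs already are.

```java
public class StatsStateSketch {
    // Hypothetical state levels, declared from least to most reliable.
    enum State { NONE, PARTIAL, COMPLETE }

    // Assumed merge rule: keep the weaker of the two states, so the result
    // is degraded only when one input is already less reliable.
    static State merge(State a, State b) {
        return a.compareTo(b) <= 0 ? a : b;
    }

    public static void main(String[] args) {
        System.out.println(merge(State.COMPLETE, State.COMPLETE));
        System.out.println(merge(State.COMPLETE, State.PARTIAL));
        System.out.println(merge(State.PARTIAL, State.NONE));
    }
}
```

Under this rule the state of an operator stays as it is unless one of its inputs forces it down, which matches the conservative behaviour described above.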


- Zoltan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63442/#review190080
-----------------------------------------------------------


On Nov. 9, 2017, 5:39 p.m., Zoltan Haindrich wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63442/
> -----------------------------------------------------------
> 
> (Updated Nov. 9, 2017, 5:39 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-17934
>     https://issues.apache.org/jira/browse/HIVE-17934
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> * remove the reactive stat state guessing method
> * make the guessing only work when a new object is created
> * change the way stat objects are merged
> 
> this patch will most probably break almost all qtest outputs....
> 
> 
> Diffs
> -----
> 
>   accumulo-handler/src/test/results/positive/accumulo_queries.q.out b3adf4e504 
>   hbase-handler/src/test/results/positive/hbase_queries.q.out b2eda12e95 
>   hbase-handler/src/test/results/positive/hbasestats.q.out 29eefd43a9 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/SparkMapJoinOptimizer.java 7a3fae65e8
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java a4f60accce
>   ql/src/java/org/apache/hadoop/hive/ql/plan/Statistics.java 8ffb4ce44b 
>   ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java ce7c96c639 
>   ql/src/test/queries/clientpositive/lateral_view_onview2.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/stats_empty_partition2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/acid_table_stats.q.out 351ff0da0a 
>   ql/src/test/results/clientpositive/alterColumnStatsPart.q.out 858e16fe22 
>   ql/src/test/results/clientpositive/annotate_stats_part.q.out 3a94a6a4e3 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out 7875e9693a 
>   ql/src/test/results/clientpositive/cbo_const.q.out e9f885b363 
>   ql/src/test/results/clientpositive/cbo_input26.q.out 77fc194829 
>   ql/src/test/results/clientpositive/columnstats_partlvl_dp.q.out 414b715b7a 
>   ql/src/test/results/clientpositive/columnstats_quoting.q.out 683c1e274f 
>   ql/src/test/results/clientpositive/columnstats_tbllvl.q.out a2c6ead293 
>   ql/src/test/results/clientpositive/constGby.q.out c633624935 
>   ql/src/test/results/clientpositive/constant_prop_3.q.out cba4744866 
>   ql/src/test/results/clientpositive/constprog3.q.out f54168d0ee 
>   ql/src/test/results/clientpositive/correlationoptimizer10.q.out a03acd38a7 
>   ql/src/test/results/clientpositive/correlationoptimizer11.q.out cf2250790a 
>   ql/src/test/results/clientpositive/correlationoptimizer13.q.out 6d4f931213 
>   ql/src/test/results/clientpositive/correlationoptimizer14.q.out 149f33fee8 
>   ql/src/test/results/clientpositive/correlationoptimizer15.q.out 2d813b239f 
>   ql/src/test/results/clientpositive/correlationoptimizer5.q.out 68d6a54862 
>   ql/src/test/results/clientpositive/correlationoptimizer7.q.out 82fecab594 
>   ql/src/test/results/clientpositive/correlationoptimizer8.q.out f3cb988a03 
>   ql/src/test/results/clientpositive/correlationoptimizer9.q.out 5372408d2a 
>   ql/src/test/results/clientpositive/cte_mat_5.q.out 3747cec891 
>   ql/src/test/results/clientpositive/display_colstats_tbllvl.q.out 8e2e77b077 
>   ql/src/test/results/clientpositive/druid_basic2.q.out 753ccb456f 
>   ql/src/test/results/clientpositive/empty_join.q.out a4a9976a7f 
>   ql/src/test/results/clientpositive/filter_cond_pushdown_HIVE_15647.q.out 779bea3a26
>   ql/src/test/results/clientpositive/groupby_sort_6.q.out a66ec97642 
>   ql/src/test/results/clientpositive/having2.q.out 80301bfc04 
>   ql/src/test/results/clientpositive/input23.q.out 80ee81b654 
>   ql/src/test/results/clientpositive/input26.q.out 1ac082eedf 
>   ql/src/test/results/clientpositive/join_cond_pushdown_unqual1.q.out 74f45e58c0 
>   ql/src/test/results/clientpositive/join_cond_pushdown_unqual2.q.out 2ac67b294c 
>   ql/src/test/results/clientpositive/join_cond_pushdown_unqual3.q.out b8d9b408d7 
>   ql/src/test/results/clientpositive/join_cond_pushdown_unqual4.q.out e5ddc3507f 
>   ql/src/test/results/clientpositive/join_view.q.out 1d83742dd4 
>   ql/src/test/results/clientpositive/lateral_view_onview.q.out 423885e442 
>   ql/src/test/results/clientpositive/lateral_view_onview2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/list_bucket_query_oneskew_2.q.out 876434fb4e 
>   ql/src/test/results/clientpositive/llap/auto_sortmerge_join_12.q.out 3acbb207a7 
>   ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction.q.out 67fe41e223
>   ql/src/test/results/clientpositive/llap/dynamic_semijoin_reduction_sw.q.out 1c672ef068
>   ql/src/test/results/clientpositive/llap/dynamic_semijoin_user_level.q.out a51637a2b9
>   ql/src/test/results/clientpositive/llap/dynpart_sort_optimization_acid.q.out 02cadb7cff
>   ql/src/test/results/clientpositive/llap/llap_nullscan.q.out 2a891234e5 
>   ql/src/test/results/clientpositive/llap/mapjoin_hint.q.out 505524e78c 
>   ql/src/test/results/clientpositive/llap/mapreduce1.q.out 0e94e71d27 
>   ql/src/test/results/clientpositive/llap/mapreduce2.q.out 6485f587f8 
>   ql/src/test/results/clientpositive/llap/metadataonly1.q.out e6853b23e3 
>   ql/src/test/results/clientpositive/llap/reduce_deduplicate.q.out 65b74ee319 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out c7b98d3967 
>   ql/src/test/results/clientpositive/llap/subquery_multi.q.out d1579033ac 
>   ql/src/test/results/clientpositive/llap/subquery_null_agg.q.out 78ee174935 
>   ql/src/test/results/clientpositive/llap/subquery_scalar.q.out 06a929dd0a 
>   ql/src/test/results/clientpositive/llap/subquery_select.q.out 514a7889b3 
>   ql/src/test/results/clientpositive/llap/tez_smb_empty.q.out 7a4db158c8 
>   ql/src/test/results/clientpositive/llap/vector_windowing_gby2.q.out ce1881b7fb 
>   ql/src/test/results/clientpositive/llap/vector_windowing_streaming.q.out 61730f59ee
>   ql/src/test/results/clientpositive/llap/vectorization_short_regress.q.out 3e246bcbe6
>   ql/src/test/results/clientpositive/materialized_view_rewrite_ssb.q.out de491989a5 
>   ql/src/test/results/clientpositive/materialized_view_rewrite_ssb_2.q.out a11d66815a
>   ql/src/test/results/clientpositive/nullgroup3.q.out fe23f39fd8 
>   ql/src/test/results/clientpositive/nullgroup5.q.out 783f6d76b6 
>   ql/src/test/results/clientpositive/partial_column_stats.q.out 44db81a443 
>   ql/src/test/results/clientpositive/perf/spark/query66.q.out 1dc0fac408 
>   ql/src/test/results/clientpositive/perf/spark/query99.q.out c0c5f136ec 
>   ql/src/test/results/clientpositive/position_alias_test_1.q.out ee81a79a0b 
>   ql/src/test/results/clientpositive/ppd_outer_join5.q.out 84c10828ce 
>   ql/src/test/results/clientpositive/ppd_repeated_alias.q.out c94002f37d 
>   ql/src/test/results/clientpositive/row__id.q.out 9aab097f21 
>   ql/src/test/results/clientpositive/semijoin4.q.out 53f6c174bd 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_12.q.out 09caf944d2 
>   ql/src/test/results/clientpositive/spark/join_cond_pushdown_unqual1.q.out dc9b61e39a
>   ql/src/test/results/clientpositive/spark/join_cond_pushdown_unqual2.q.out 82634fba44
>   ql/src/test/results/clientpositive/spark/join_cond_pushdown_unqual3.q.out d1b20006b0
>   ql/src/test/results/clientpositive/spark/join_cond_pushdown_unqual4.q.out 2bfc81d275
>   ql/src/test/results/clientpositive/spark/join_view.q.out 61867f75f3 
>   ql/src/test/results/clientpositive/spark/optimize_nullscan.q.out d294f4910c 
>   ql/src/test/results/clientpositive/spark/ppd_outer_join5.q.out e49260aa35 
>   ql/src/test/results/clientpositive/spark/semijoin.q.out d2dac10f3f 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out e2f68a02bc 
>   ql/src/test/results/clientpositive/spark/spark_dynamic_partition_pruning.q.out d7b445baf8
>   ql/src/test/results/clientpositive/spark/spark_vectorized_dynamic_partition_pruning.q.out 1a8e9ffcc5
>   ql/src/test/results/clientpositive/spark/subquery_in.q.out fd25e36fba 
>   ql/src/test/results/clientpositive/spark/subquery_multi.q.out b91c33ee4a 
>   ql/src/test/results/clientpositive/spark/subquery_null_agg.q.out 945e2a7102 
>   ql/src/test/results/clientpositive/spark/subquery_scalar.q.out 8f3ac0d636 
>   ql/src/test/results/clientpositive/spark/subquery_select.q.out edb2b92f73 
>   ql/src/test/results/clientpositive/spark/union_remove_25.q.out f681428785 
>   ql/src/test/results/clientpositive/spark/vectorization_short_regress.q.out 78740fec6f
>   ql/src/test/results/clientpositive/stats_empty_partition2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/subquery_exists_having.q.out ef06dfe697 
>   ql/src/test/results/clientpositive/subquery_unqualcolumnrefs.q.out 79b7d83619 
>   ql/src/test/results/clientpositive/temp_table_display_colstats_tbllvl.q.out a202e45be9
>   ql/src/test/results/clientpositive/union_remove_25.q.out 20ab809cb1 
>   ql/src/test/results/clientpositive/union_view.q.out 35f8a9a226 
> 
> 
> Diff: https://reviews.apache.org/r/63442/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Zoltan Haindrich
> 
>

