hive-dev mailing list archives

From pengcheng xiong <pxi...@hortonworks.com>
Subject Re: Review Request 59468: Optimize a combination of avg(), sum(), count(distinct) etc
Date Sat, 27 May 2017 19:39:19 GMT


> On May 27, 2017, 4:41 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/queries/clientpositive/count_dist_rewrite.q
> > Lines 63-65 (patched)
> > <https://reviews.apache.org/r/59468/diff/3/?file=1733999#file1733999line63>
> >
> >     As mentioned previously, let's delete these tests.

I assumed that you previously wanted some negative tests for this, no?


> On May 27, 2017, 4:41 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/perf/query16.q.out
> > Lines 3-5 (original), 3-5 (patched)
> > <https://reviews.apache.org/r/59468/diff/3/?file=1734004#file1734004line3>
> >
> >     Optimization shouldn't have fired in this case. Aggregations are on different columns.

IMHO, it should fire in this case. Here and in the following cases, a single reducer produces
a single row with a constant group-by key, i.e., everything goes to the same group. After the
patch, the first stage computes partial results grouped by the distinct column, and the second
stage aggregates all of the partial results together. I think this is exactly what you asked
for previously, i.e., using an extra stage to reduce the result step by step. Please correct me
if my understanding is wrong. FYI, we also have test cases in count_dist_rewrite.q that cover
this.
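
To make the two-stage idea concrete, here is a rough Python sketch (not the actual
CountDistinctRewriteProc logic, just an illustration). The sample rows, the column names x and
y, and the function names are all made up for this example. It assumes a query shaped like
count(distinct x), sum(y), count(y) with no group-by key, where y is a different column from x.

```python
from collections import defaultdict

# Hypothetical rows (x, y): x is the count(distinct) column, y is a different column.
rows = [(1, 10), (1, 20), (2, 5), (3, 7), (3, 1), (None, 4)]


def direct(rows):
    """Single-stage reference: count(distinct x), sum(y), count(y)."""
    distinct_x = {x for x, _ in rows if x is not None}
    ys = [y for _, y in rows if y is not None]
    return len(distinct_x), sum(ys), len(ys)


def two_stage(rows):
    """Same aggregates computed in two stages, as described above."""
    # Stage 1: group by the distinct column x and keep partial
    # [sum(y), count(y)] for each group.
    partial = defaultdict(lambda: [0, 0])
    for x, y in rows:
        acc = partial[x]  # creates the group even when y is NULL
        if y is not None:
            acc[0] += y
            acc[1] += 1

    # Stage 2: constant group-by key, so every partial row lands in one group.
    # count(distinct x) becomes a count of the non-NULL stage-1 keys, and the
    # partial sums and counts are simply added up.
    distinct_count = sum(1 for x in partial if x is not None)
    total_sum = sum(s for s, _ in partial.values())
    total_count = sum(c for _, c in partial.values())
    return distinct_count, total_sum, total_count


print(two_stage(rows) == direct(rows))
```

Since sum and count can be merged from partial results, aggregating on a column other than the
distinct one does not change the answer in this sketch; avg(y) would follow as the merged sum
divided by the merged count.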


- pengcheng


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59468/#review176243
-----------------------------------------------------------


On May 27, 2017, 2:20 a.m., pengcheng xiong wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59468/
> -----------------------------------------------------------
> 
> (Updated May 27, 2017, 2:20 a.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Gopal V.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-16654
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2dfc8b6f89 
>   itests/src/test/resources/testconfiguration.properties 47a13c93b9 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 8b04cd44fa 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/CountDistinctRewriteProc.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 7dace9076f 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/GroupByDesc.java 38a9ef2af1 
>   ql/src/test/queries/clientpositive/count_dist_rewrite.q PRE-CREATION 
>   ql/src/test/results/clientpositive/count_dist_rewrite.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/groupby_sort_11.q.out 2b3bf4a07a 
>   ql/src/test/results/clientpositive/llap/count_dist_rewrite.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/nullgroup4.q.out e5a8eeee14 
>   ql/src/test/results/clientpositive/perf/query16.q.out cf90c0c162 
>   ql/src/test/results/clientpositive/perf/query28.q.out 78129cf68b 
>   ql/src/test/results/clientpositive/perf/query94.q.out 836b16bf9f 
>   ql/src/test/results/clientpositive/perf/query95.q.out fa94d0842b 
>   ql/src/test/results/clientpositive/udf_count.q.out f60ad0485e 
> 
> 
> Diff: https://reviews.apache.org/r/59468/diff/3/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>

