hive-dev mailing list archives

From Ashutosh Chauhan <hashut...@apache.org>
Subject Re: Review Request 56140: Can't order by an unselected column
Date Wed, 17 May 2017 01:32:16 GMT


> On May 3, 2017, 4:24 a.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/cp_sel.q.out
> > Line 46 (original), 50 (patched)
> > <https://reviews.apache.org/r/56140/diff/8/?file=1704004#file1704004line50>
> >
> >     Is this expected? 
> >     Seems like this may generate wrong results since there might be multiple tasks for Reducers each of which emit 1 row. Limit in fetch operator is needed.
> 
> pengcheng xiong wrote:
>     Yes, it is. In case of order by, only 1 reducer is used, so no need of another shuffle.

Correct. What's the reason for this? Is it because we got rid of the Order by on the Calcite tree itself?


> On May 3, 2017, 4:24 a.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/llap/vector_coalesce.q.out
> > Line 447 (original)
> > <https://reviews.apache.org/r/56140/diff/8/?file=1704013#file1704013line461>
> >
> >     No RS for order by.
> 
> pengcheng xiong wrote:
>     This is actually improvement. The query is {SELECT cfloat, cbigint, coalesce(cfloat, cbigint, 0) as c
>     FROM alltypesorc
>     WHERE (cfloat IS NULL AND cbigint IS NULL)
>     ORDER BY cfloat, cbigint, c
>     LIMIT 10;}
>     You can see that, cfloat, cbigint, c are all nulls....
>     
>     The op tree is like this
>     
>     HiveSortLimit(offset=[0], fetch=[10])
>       HiveProject(cfloat=[$0], cbigint=[$1], c=[$2])
>         HiveSortLimit(sort0=[$0], sort1=[$1], sort2=[$2], dir0=[ASC-nulls-first], dir1=[ASC-nulls-first], dir2=[ASC-nulls-first])
>           HiveProject(cfloat=[$4], cbigint=[$3], c=[coalesce($4, $3, 0)], ctinyint=[$0], csmallint=[$1], cint=[$2], cbigint1=[$3], cfloat1=[$4], cdouble=[$5], cstring1=[$6], cstring2=[$7], ctimestamp1=[$8], ctimestamp2=[$9], cboolean1=[$10], cboolean2=[$11], block__offset__inside__file=[$12], input__file__name=[$13], row__id=[$14])
>             HiveFilter(condition=[AND(IS NULL($4), IS NULL($3))])
>               HiveTableScan(table=[[default.alltypesorc]], table:alias=[alltypesorc])
>     
>     
>     After running HiveProjectFilterPullUpConstantsRule and HiveReduceExpressionsRule, we get rid of the order by...

There is also a limit in the query. After this change, the limit is executed on the map side (potentially in multiple tasks), but since FetchOperator doesn't have a limit, the limit may not be honored.
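
To make the concern concrete, here is a minimal Python sketch of it. This is not Hive code: the function names `map_side_limit` and `fetch`, the three splits, and the row values are all hypothetical and only model the behavior described above, under the assumption that each map task applies the limit independently to its own split.

```python
from typing import Iterable, List, Optional


def map_side_limit(split: Iterable[int], limit: int) -> List[int]:
    """Hypothetical map task: emits at most `limit` rows from its own split."""
    out: List[int] = []
    for row in split:
        if len(out) >= limit:
            break
        out.append(row)
    return out


def fetch(task_outputs: List[List[int]], limit: Optional[int] = None) -> List[int]:
    """Hypothetical fetch step: concatenates task outputs, optionally applying a final limit."""
    rows = [row for out in task_outputs for row in out]
    return rows if limit is None else rows[:limit]


# Three splits of 25 rows each, processed by three independent map tasks.
splits = [list(range(0, 25)), list(range(25, 50)), list(range(50, 75))]
limit = 10

task_outputs = [map_side_limit(s, limit) for s in splits]

# No limit in the fetch step: every task contributes up to `limit` rows.
without_final_limit = fetch(task_outputs)

# A limit in the fetch step caps the combined result.
with_final_limit = fetch(task_outputs, limit)
```

In this model the result without a final limit holds one batch of `limit` rows per task, which is the failure mode being asked about; whether the actual plan behaves this way depends on how many tasks the query runs.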


> On May 3, 2017, 4:24 a.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/vector_date_1.q.out
> > Lines 598-607 (original)
> > <https://reviews.apache.org/r/56140/diff/8/?file=1704046#file1704046line598>
> >
> >     This plan looks incorrect. For an order by there should necessarily be a RS in plan, otherwise we can get sorting in map only plan.
> 
> pengcheng xiong wrote:
>     dt1 is constant.

Is this change because we optimized away the order by on the Calcite tree?
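
For reference, a minimal Python sketch of why an order by on a constant key can be dropped. This is an illustration only, not the Calcite rule: `order_by` is a hypothetical helper, the sample rows and the `id` column are made up, and NULL is modeled as `None` sorted first to mirror ASC-nulls-first.

```python
from typing import Any, Dict, List


def order_by(rows: List[Dict[str, Any]], key_columns: List[str]) -> List[Dict[str, Any]]:
    """Hypothetical ORDER BY: ascending, with None (NULL) sorted first."""
    def key(row: Dict[str, Any]):
        # (False, None) sorts before (True, value), so NULLs come first.
        return tuple((row[c] is not None, row[c]) for c in key_columns)
    return sorted(rows, key=key)


# Every row has the same value for dt1, as when dt1 is a constant in the query.
rows = [
    {"id": 3, "dt1": "2001-01-01"},
    {"id": 1, "dt1": "2001-01-01"},
    {"id": 2, "dt1": "2001-01-01"},
]

sorted_rows = order_by(rows, ["dt1"])

# The same holds when all sort keys are NULL.
null_rows = [{"id": 2, "cfloat": None, "cbigint": None},
             {"id": 1, "cfloat": None, "cbigint": None}]
sorted_null_rows = order_by(null_rows, ["cfloat", "cbigint"])
```

Because all keys compare equal, any ordering of the rows satisfies the order by, so removing the sort does not change which results are valid. This says nothing about the limit, which still has to be enforced separately.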


- Ashutosh


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56140/#review173690
-----------------------------------------------------------


On May 1, 2017, 5:30 p.m., pengcheng xiong wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56140/
> -----------------------------------------------------------
> 
> (Updated May 1, 2017, 5:30 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-15160
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveProjectSortTransposeRule.java 1487ed4f8e 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 1b054a7e24 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/RowResolver.java 262dafb487 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 654f3b1772 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java 8f8eab0d9c 
>   ql/src/test/queries/clientpositive/order_by_expr_1.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/order_by_expr_2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/annotate_stats_select.q.out 873f1abb25 
>   ql/src/test/results/clientpositive/cp_sel.q.out 1778ccd6a6 
>   ql/src/test/results/clientpositive/druid_basic2.q.out 6177d56987 
>   ql/src/test/results/clientpositive/dynamic_rdd_cache.q.out 2abb819558 
>   ql/src/test/results/clientpositive/groupby_grouping_sets_grouping.q.out 473d17a1bd
>   ql/src/test/results/clientpositive/llap/bucket_groupby.q.out d724131fca 
>   ql/src/test/results/clientpositive/llap/explainuser_1.q.out 584c3b5520 
>   ql/src/test/results/clientpositive/llap/limit_pushdown.q.out dd54dd22a6 
>   ql/src/test/results/clientpositive/llap/limit_pushdown3.q.out 24645b6426 
>   ql/src/test/results/clientpositive/llap/offset_limit_ppd_optimizer.q.out 83de1fbea1
>   ql/src/test/results/clientpositive/llap/vector_coalesce.q.out 578f849bdb 
>   ql/src/test/results/clientpositive/llap/vector_date_1.q.out a4f1050c89 
>   ql/src/test/results/clientpositive/llap/vector_decimal_2.q.out 144356c108 
>   ql/src/test/results/clientpositive/llap/vector_decimal_round.q.out 8bd80cf860 
>   ql/src/test/results/clientpositive/llap/vector_groupby_grouping_sets_grouping.q.out 5af9e61b0a 
>   ql/src/test/results/clientpositive/llap/vector_groupby_grouping_sets_limit.q.out f731ceecdc
>   ql/src/test/results/clientpositive/llap/vector_interval_1.q.out debf5ab39e 
>   ql/src/test/results/clientpositive/llap/vector_interval_arithmetic.q.out aadb6e72cd
>   ql/src/test/results/clientpositive/order3.q.out 898f7a8853 
>   ql/src/test/results/clientpositive/order_by_expr_1.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/order_by_expr_2.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/pcr.q.out a1301fdf79 
>   ql/src/test/results/clientpositive/perf/query31.q.out 3ed312d3e3 
>   ql/src/test/results/clientpositive/perf/query36.q.out 57ab26acc6 
>   ql/src/test/results/clientpositive/perf/query39.q.out 19472c4d5e 
>   ql/src/test/results/clientpositive/perf/query42.q.out 3bebac3321 
>   ql/src/test/results/clientpositive/perf/query52.q.out 74ecaf28ba 
>   ql/src/test/results/clientpositive/perf/query64.q.out 6b42393aad 
>   ql/src/test/results/clientpositive/perf/query66.q.out 072bfee92b 
>   ql/src/test/results/clientpositive/perf/query70.q.out 8e42fac9c5 
>   ql/src/test/results/clientpositive/perf/query75.q.out b1e236d325 
>   ql/src/test/results/clientpositive/perf/query81.q.out a09d5c99b5 
>   ql/src/test/results/clientpositive/perf/query85.q.out 168bcd2a4a 
>   ql/src/test/results/clientpositive/perf/query86.q.out 734e6a480b 
>   ql/src/test/results/clientpositive/perf/query89.q.out 66481f710b 
>   ql/src/test/results/clientpositive/perf/query91.q.out e592bba8d9 
>   ql/src/test/results/clientpositive/pointlookup2.q.out 3438c74608 
>   ql/src/test/results/clientpositive/pointlookup3.q.out 2c3e39fd15 
>   ql/src/test/results/clientpositive/ppd_udf_case.q.out 7678d03415 
>   ql/src/test/results/clientpositive/spark/dynamic_rdd_cache.q.out 6572511967 
>   ql/src/test/results/clientpositive/spark/limit_pushdown.q.out ede0096c73 
>   ql/src/test/results/clientpositive/spark/pcr.q.out 77ac020d07 
>   ql/src/test/results/clientpositive/vector_coalesce.q.out f158236beb 
>   ql/src/test/results/clientpositive/vector_date_1.q.out c2389e6b1e 
>   ql/src/test/results/clientpositive/vector_decimal_round.q.out de49c170cf 
>   ql/src/test/results/clientpositive/vector_interval_1.q.out f53a2c2db5 
>   ql/src/test/results/clientpositive/vector_interval_arithmetic.q.out 75250e30a4 
>   ql/src/test/results/clientpositive/view_alias.q.out 90bf28dd9b 
> 
> 
> Diff: https://reviews.apache.org/r/56140/diff/8/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>

