hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ashutosh Chauhan <hashut...@apache.org>
Subject Re: Review Request 54517: HIVE-15192: Subquery support with Calcite
Date Tue, 13 Dec 2016 01:51:14 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/54517/#review158889
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/lib/SubQueryWalker.java (line 27)
<https://reviews.apache.org/r/54517/#comment229712>

    Better name: Expression walker?



ql/src/java/org/apache/hadoop/hive/ql/lib/SubQueryWalker.java (line 40)
<https://reviews.apache.org/r/54517/#comment229738>

    When is later? 
    SubQueryExprProcessor already handles TOK_SUBQUERY_OP so why do we need to bypass subquery
processing?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelShuttle.java (lines 26 - 29)
<https://reviews.apache.org/r/54517/#comment229701>

    Calcite's RelShuttle should work on Project and not LogicalProject. Can you raise this
on calcite list, seems like that interface needs to be enhanced.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveReplicatedRelBuilder.java (line
111)
<https://reviews.apache.org/r/54517/#comment229697>

    Add a TODO about removing this class once we upgrade caclite to 1.10 referencing calcite
jira.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveReplicatedRelBuilder.java (lines
141 - 156)
<https://reviews.apache.org/r/54517/#comment229739>

    These should be Hive Rel Factories, not default.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveReplicatedRelBuilder.java (lines
173 - 187)
<https://reviews.apache.org/r/54517/#comment229740>

    Can be removed.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveReplicatedRelBuilder.java (lines
215 - 224)
<https://reviews.apache.org/r/54517/#comment229741>

    Can be removed.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveFilter.java (line
64)
<https://reviews.apache.org/r/54517/#comment229699>

    Wont this assert fail (correctly) for non-correlated variables?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveFilter.java (line
82)
<https://reviews.apache.org/r/54517/#comment229698>

    Each HiveFilter will be visited by RelShuttle. Why do we need to traverse the tree here?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (line
116)
<https://reviews.apache.org/r/54517/#comment229702>

    Can you also add comments on what needs to be done before we can get rid of this class
and rely solely on Calcite's RelDecorrelator?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (lines
214 - 218)
<https://reviews.apache.org/r/54517/#comment229751>

    Do we need to fire all these rules? Some of the rules are already used during optimization
of calcite tree. Shall we just call those rules before calling Decorrelator? Will avoid calling
rules twice.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (lines
233 - 234)
<https://reviews.apache.org/r/54517/#comment229752>

    Similarly these rules can be called after decorrelator on whole tree from Calcite planner.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (line
283)
<https://reviews.apache.org/r/54517/#comment229753>

    can be called before Decorrelator?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (line
363)
<https://reviews.apache.org/r/54517/#comment229755>

    needs to be overloaded with hivesortlimit argument?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (line
419)
<https://reviews.apache.org/r/54517/#comment229756>

    This can be removed since we alreafy have overloaded version of this with HiveAggregate.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (line
701)
<https://reviews.apache.org/r/54517/#comment229758>

    Need to create hiveproject here.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (line
860)
<https://reviews.apache.org/r/54517/#comment229757>

    Can be removed.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (lines
1053 - 1054)
<https://reviews.apache.org/r/54517/#comment229760>

    Can this be removed? If so, what above comment above?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (line
1081)
<https://reviews.apache.org/r/54517/#comment229761>

    HiveJoin needs to be created.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (line
1144)
<https://reviews.apache.org/r/54517/#comment229762>

    Can be removed.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (line
1181)
<https://reviews.apache.org/r/54517/#comment229788>

    HiveFilter?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (line
1355)
<https://reviews.apache.org/r/54517/#comment229789>

    can be removed.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (line
1507)
<https://reviews.apache.org/r/54517/#comment229790>

    HiveProject.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (line
1554)
<https://reviews.apache.org/r/54517/#comment229791>

    hive project



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (line
1931)
<https://reviews.apache.org/r/54517/#comment229793>

    This should be Aggregate and Project instead.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (line
2036)
<https://reviews.apache.org/r/54517/#comment229794>

    instanceof Filter, instead.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java (lines
2182 - 2185)
<https://reviews.apache.org/r/54517/#comment229795>

    types of operands.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveSubQueryRemoveRule.java
(line 56)
<https://reviews.apache.org/r/54517/#comment229703>

    Add notes on what needs to be done to eliminate this class?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveSubQueryRemoveRule.java
(line 70)
<https://reviews.apache.org/r/54517/#comment229745>

    HiveRelBuilder?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveSubQueryRemoveRule.java
(line 77)
<https://reviews.apache.org/r/54517/#comment229746>

    This assert can fail if there is no subquery, no?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveSubQueryRemoveRule.java
(lines 181 - 197)
<https://reviews.apache.org/r/54517/#comment229747>

    Remove commented code?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java (line
123)
<https://reviews.apache.org/r/54517/#comment229704>

    Add a comment on when this outerRR comes in a play?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java (line
132)
<https://reviews.apache.org/r/54517/#comment229705>

    spelling



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java (line
204)
<https://reviews.apache.org/r/54517/#comment229706>

    throwing exception here is better here, since we don't expect to get in here.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java (line
510)
<https://reviews.apache.org/r/54517/#comment229800>

    in case of mal-formed query.. you may not find a reference in outerRR. Should check pos
and throw exception if its not valid.



ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java (line
513)
<https://reviews.apache.org/r/54517/#comment229707>

    this should be RexCorrelVariable, not RexFieldAccess? No?



ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java (line 2056)
<https://reviews.apache.org/r/54517/#comment229803>

    should generate a new id for new QB.



ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java (lines 542 - 550)
<https://reviews.apache.org/r/54517/#comment229708>

    Can just remove this block.



ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckCtx.java (line 43)
<https://reviews.apache.org/r/54517/#comment229805>

    I think outerRR design will work only if subquery references columns of its parent. What
if subquery references columns of grandparent? It will be good to leave a TODO for this case.



ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeSubQueryDesc.java (line 27)
<https://reviews.apache.org/r/54517/#comment229709>

    Update description.



ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeSubQueryDesc.java (line 29)
<https://reviews.apache.org/r/54517/#comment229711>

    I am not sure about ExprNodeSubQueryDesc class since it encapsulates both calcite as well
as Hive data structure. Till Now, ExprNodeDesc is exclusively Hive concept. 
    Is it necessary to mix these two?



ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeSubQueryDesc.java (lines 32 - 33)
<https://reviews.apache.org/r/54517/#comment229710>

    use enums



ql/src/test/queries/clientpositive/cbo_rp_auto_join1.q (line 3)
<https://reviews.apache.org/r/54517/#comment229806>

    Remove it from q file and data conf flag in data/conf/hive-site.xml



ql/src/test/queries/clientpositive/semijoin5.q (line 1)
<https://reviews.apache.org/r/54517/#comment229807>

    Any reason for this?



ql/src/test/results/clientpositive/constant_prop_3.q.out (line 161)
<https://reviews.apache.org/r/54517/#comment229808>

    Plan change because of semi join flag?



ql/src/test/results/clientpositive/constprog_partitioner.q.out (lines 84 - 86)
<https://reviews.apache.org/r/54517/#comment229809>

    Semijoin flag?



ql/src/test/results/clientpositive/llap/cbo_rp_subq_in.q.out (lines 85 - 86)
<https://reviews.apache.org/r/54517/#comment229812>

    Need to disable this query file



ql/src/test/results/clientpositive/masking_3.q.out (lines 19 - 21)
<https://reviews.apache.org/r/54517/#comment229810>

    Need to set semijoin flag in q file.



ql/src/test/results/clientpositive/masking_4.q.out (lines 170 - 171)
<https://reviews.apache.org/r/54517/#comment229811>

    semi join flag.


- Ashutosh Chauhan


On Dec. 12, 2016, 6:17 p.m., Vineet Garg wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/54517/
> -----------------------------------------------------------
> 
> (Updated Dec. 12, 2016, 6:17 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-15192
>     https://issues.apache.org/jira/browse/HIVE-15192
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch is 1st phase of getting rid of Hive's subquery transformation and de-corelation
logic and leverage Calcite's functionality to plan sub-queries.
> 
> Known issues with this patch
> * Few return path tests are failing and are disabled with this patch.
> * Semi-join optimization (HIVE-15227) is disabled currently as it doesn't work with this
patch.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 9064e49 
>   itests/src/test/resources/testconfiguration.properties aa3d72d 
>   ql/src/java/org/apache/hadoop/hive/ql/lib/SubQueryWalker.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelShuttle.java PRE-CREATION

>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelShuttleImpl.java PRE-CREATION

>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveReplicatedRelBuilder.java
PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveFilter.java
0410c91 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveJoin.java
ba9483e 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/reloperators/HiveProject.java
3e0a9a6 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java
PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveSubQueryRemoveRule.java
PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveOpConverter.java
d494c9f 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/RexNodeConverter.java
f8fb475 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java f1f3bf9 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java 3458fb6 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 79e55b2 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckCtx.java 02896ff 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java ace3eaf 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeSubQueryDesc.java PRE-CREATION 
>   ql/src/test/queries/clientnegative/subquery_nested_subquery.q  
>   ql/src/test/queries/clientnegative/subquery_restrictions.q PRE-CREATION 
>   ql/src/test/queries/clientnegative/subquery_shared_alias.q  
>   ql/src/test/queries/clientpositive/cbo_rp_auto_join1.q cbfb5d5 
>   ql/src/test/queries/clientpositive/join31.q c79105f 
>   ql/src/test/queries/clientpositive/multiMapJoin2.q cf5dbb0 
>   ql/src/test/queries/clientpositive/semijoin5.q 3e7c20a 
>   ql/src/test/queries/clientpositive/subquery_in.q c01ae70 
>   ql/src/test/queries/clientpositive/subquery_notin.q 3f4fb7f 
>   ql/src/test/queries/clientpositive/subquery_notin_having.q 05148df 
>   ql/src/test/results/clientnegative/subquery_in_groupby.q.out 809bb0a 
>   ql/src/test/results/clientnegative/subquery_in_select.q.out 3d74132 
>   ql/src/test/results/clientnegative/subquery_nested_subquery.q.out 140b093 
>   ql/src/test/results/clientnegative/subquery_restrictions.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/constant_prop_3.q.out 58f1065 
>   ql/src/test/results/clientpositive/constprog_partitioner.q.out d1016ad 
>   ql/src/test/results/clientpositive/llap/cbo_rp_subq_in.q.out f6bfad2 
>   ql/src/test/results/clientpositive/llap/dynamic_partition_pruning.q.out f993cf0 
>   ql/src/test/results/clientpositive/llap/explainuser_1.q.out 70ec02f 
>   ql/src/test/results/clientpositive/llap/lineage3.q.out 1a532da 
>   ql/src/test/results/clientpositive/llap/subquery_exists.q.out 1a006d8 
>   ql/src/test/results/clientpositive/llap/subquery_in.q.out 321f1cc 
>   ql/src/test/results/clientpositive/llap/subquery_nested_subquery.q.out PRE-CREATION

>   ql/src/test/results/clientpositive/llap/subquery_notin.q.out 3da1acb 
>   ql/src/test/results/clientpositive/llap/subquery_shared_alias.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/llap/subquery_views.q.out 35e80ae 
>   ql/src/test/results/clientpositive/llap/vector_groupby_mapjoin.q.out ff658d7 
>   ql/src/test/results/clientpositive/llap/vector_mapjoin_reduce.q.out a075662 
>   ql/src/test/results/clientpositive/llap/vectorized_dynamic_partition_pruning.q.out
76c8404 
>   ql/src/test/results/clientpositive/masking_3.q.out 1925dce 
>   ql/src/test/results/clientpositive/masking_4.q.out 7e923e8 
>   ql/src/test/results/clientpositive/perf/query45.q.out 7bc137c 
>   ql/src/test/results/clientpositive/perf/query70.q.out 611af74 
>   ql/src/test/results/clientpositive/semijoin4.q.out 3c065a9 
>   ql/src/test/results/clientpositive/semijoin5.q.out 63b477c 
>   ql/src/test/results/clientpositive/spark/constprog_partitioner.q.out 567c6d3 
>   ql/src/test/results/clientpositive/spark/subquery_exists.q.out b58fcbe 
>   ql/src/test/results/clientpositive/spark/subquery_in.q.out 21a48ec 
>   ql/src/test/results/clientpositive/spark/vector_mapjoin_reduce.q.out 012c3eb 
>   ql/src/test/results/clientpositive/subq_where_serialization.q.out f689651 
>   ql/src/test/results/clientpositive/subquery_exists.q.out 86f9089 
>   ql/src/test/results/clientpositive/subquery_exists_having.q.out 8861c82 
>   ql/src/test/results/clientpositive/subquery_in_having.q.out 854aa36 
>   ql/src/test/results/clientpositive/subquery_notexists.q.out ede7855 
>   ql/src/test/results/clientpositive/subquery_notexists_having.q.out 9349f2d 
>   ql/src/test/results/clientpositive/subquery_notin_having.q.out 804f411 
>   ql/src/test/results/clientpositive/subquery_unqualcolumnrefs.q.out c7e1f02 
>   ql/src/test/results/clientpositive/vector_groupby_mapjoin.q.out 3468657 
>   ql/src/test/results/clientpositive/vector_mapjoin_reduce.q.out 160b088 
> 
> Diff: https://reviews.apache.org/r/54517/diff/
> 
> 
> Testing
> -------
> 
> * Added new tests.
> * Pre-commit testing on-going
> 
> 
> Thanks,
> 
> Vineet Garg
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message