hive-dev mailing list archives

From Ashutosh Chauhan <hashut...@apache.org>
Subject Re: Review Request 60116: HIVE-16885
Date Mon, 19 Jun 2017 16:02:38 GMT


> On June 19, 2017, 3:21 p.m., Jesús Camacho Rodríguez wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 1092 (patched)
> > <https://reviews.apache.org/r/60116/diff/1/?file=1751804#file1751804line1092>
> >
> >     The problem is not cross joins, but all kinds of inner joins.
> >     
> >     For instance, consider _JOIN ON (a=b and c>10)_. With this property set to
> >     true, _a_ and _b_ are keys and _c>10_ is the residual. Thus, the plan (in fact, the work
> >     containing the JOIN) will not be vectorized; if the optimization is disabled, this
> >     would not happen.
> >     
> >     My idea was to create a follow-up to close the gap for vectorization and then
> >     enable it by default. Another option would be to push the residual within the join only for
> >     cross joins and lift the restriction when the vectorization support for residual predicates
> >     is there. What do you think?
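
For readers unfamiliar with the key/residual split in the _JOIN ON (a=b and c>10)_ example quoted above, here is a minimal Java sketch of the idea. It is not Hive's implementation: the class, record types, and method names are all hypothetical, and a plain in-memory hash join stands in for the real operator. The equality _a=b_ drives the hash lookup, while _c>10_ is evaluated inside the join on each key match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names, not Hive's CommonJoinOperator.
public class ResidualJoinSketch {

    /** Left-side row carrying the join key a. */
    public static final class Left {
        public final int a;
        public Left(int a) { this.a = a; }
    }

    /** Right-side row carrying the join key b and the extra column c. */
    public static final class Right {
        public final int b;
        public final int c;
        public Right(int b, int c) { this.b = b; this.c = c; }
    }

    /** One output row of the join. */
    public static final class Joined {
        public final int a;
        public final int b;
        public final int c;
        public Joined(int a, int b, int c) { this.a = a; this.b = b; this.c = c; }
    }

    /** Inner join ON (a = b AND c > 10): a = b is the key, c > 10 the residual. */
    public static List<Joined> join(List<Left> left, List<Right> right) {
        // Build phase: hash the right side on the equi-join key b.
        Map<Integer, List<Right>> rightByKey = new HashMap<>();
        for (Right r : right) {
            rightByKey.computeIfAbsent(r.b, k -> new ArrayList<>()).add(r);
        }
        // Probe phase: look up by key a, then apply the residual to each match.
        List<Joined> out = new ArrayList<>();
        for (Left l : left) {
            for (Right r : rightByKey.getOrDefault(l.a, Collections.<Right>emptyList())) {
                if (r.c > 10) { // residual predicate, evaluated inside the join
                    out.add(new Joined(l.a, r.b, r.c));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Left> left = Arrays.asList(new Left(1), new Left(2), new Left(3));
        List<Right> right = Arrays.asList(
                new Right(1, 5), new Right(1, 20), new Right(2, 11), new Right(4, 50));
        for (Joined j : join(left, right)) {
            System.out.println(j.a + " " + j.b + " " + j.c);
        }
    }
}
```

With the residual kept outside the join, the same query would instead be a plain equi-join on _a=b_ followed by a separate filter on _c>10_, which is the shape the comment above says can still be vectorized.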

That still depends on data distribution. The non-vectorized path may still be faster.
I think what we should do is a) turn this on for tests via data/conf/llap/hive-site.xml
and data/conf/tez/hive-site.xml so that we get test coverage from the compiler side, and b)
create a follow-up JIRA for the vectorization work.


- Ashutosh


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60116/#review178249
-----------------------------------------------------------


On June 15, 2017, 10:49 a.m., Jesús Camacho Rodríguez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60116/
> -----------------------------------------------------------
> 
> (Updated June 15, 2017, 10:49 a.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-16885
>     https://issues.apache.org/jira/browse/HIVE-16885
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-16885
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java fce8db3df1026de8b6ee8c59567e55db40696217 
>   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 6651900e79a5c3d4ad8329afbe3894544ce9f46e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java 07fd653dedc9a98d89b492ae6b49da70984569f7 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 737aad1b764ee6487b420f2b9ea651c42e08e9bf 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java fc6adafa0ebd0bd49d59cd0f4a82f70e9646ca6d 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 9e84a29470c481d932d4f2d12e2898e05a925e5b 
>   ql/src/test/queries/clientpositive/join47.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/mapjoin47.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/smb_mapjoin_47.q PRE-CREATION 
>   ql/src/test/results/clientpositive/join47.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/mapjoin47.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/smb_mapjoin_47.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/60116/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jesús Camacho Rodríguez
> 
>

