hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ashutosh Chauhan <hashut...@apache.org>
Subject Re: Review Request 59697: Fix HiveFilterAggregateTransposeRule when filter is always false
Date Sat, 03 Jun 2017 20:03:22 GMT


> On June 1, 2017, 9:08 p.m., pengcheng xiong wrote:
> > ql/src/test/results/clientpositive/perf/query4.q.out
> > Lines 436 (patched)
> > <https://reviews.apache.org/r/59697/diff/2/?file=1738886#file1738886line436>
> >
> >     The pattern TS-FIL(false) is here. I will take another look to see why it does
not fire.
> 
> pengcheng xiong wrote:
>     The reason is because of SharedScan. As all the branches in the union share the same
3 table scan, although only one of them will participate in the join, we still need to scan
the table and then branch out. You will see lots of filters out of the same scan in the explain
extended. If you turn off the shared scan, you will see the null scan optimization. This is
interesting....
> 
> Jesús Camacho Rodríguez wrote:
>     When SharedScan optimizer is executed, it seems we still do not have the information
about the branches that will be removed. Is there a chance to extend NullScanOptimizer to
remove those branches that will not contribute to result? Or is it even possible to move NullScanOptimizer
to be executed before SharedScanOptimizer?

Ideal thing to do here is to remove branches of tree corresponding to TS-FIL(false) But that
removal is non-trivial in general case since it may make tree and result schema invalid e.g.,
in case of joins. Thats why NullScan instead of removing branch just replaces TS with ZeroRowInputFormat.
We can remove them for union though which doesnt alter result schema and which is what HIVE-16797
is doing. Since that will happen early in logical planning, tree will be optimal by the time
Shared Scan or NullScan sees it. So, I think with the rules we have, plan currently generated
is optimal one we can generate for non-union cases like join like here. There is one minor
extension which can be done is to enhance Shared Scan to simplify generated OR expression
tree to drop false filter (and potentially other expression simplifications) after generating
combining OR filter but that as I said is minor.


- Ashutosh


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59697/#review176665
-----------------------------------------------------------


On June 1, 2017, 7:46 p.m., pengcheng xiong wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59697/
> -----------------------------------------------------------
> 
> (Updated June 1, 2017, 7:46 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> hive-16775
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterAggregateTransposeRule.java
0e5c7313b6 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java 7f583ed075 
>   ql/src/test/queries/clientpositive/filter_aggr.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/perf/query4.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/perf/query74.q PRE-CREATION 
>   ql/src/test/results/clientpositive/filter_aggr.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/perf/query4.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/perf/query74.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/59697/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message