I saw the same result. But I debugged a little and figured out that it was the PPD optimizer
that did the transitive propagation, not PredicateTransitivePropagate.
------------------ Original ------------------
From: "Gopal Vijayaraghavan"<gopalv@apache.org>;
Date: Wed, Aug 19, 2015 02:52 PM
To: "user"<user@hive.apache.org>; "dev"<dev@hive.apache.org>;
Cc: "孙若曦"<ruoxi.sun@transwarp.io>;
Subject: Re: Question about PredicateTransitivePropagate
> select * from t1 join t2 on t1.col = t2.col where t1.col = 1;
> Is rule PredicateTransitivePropagate supposed to propagate predicate
> "t1.col = 1" to t2 via join condition t1.col = t2.col?
> Assuming so, I found that the predicate "t1.col = 1" has not been pushed
> down to table scan of t1, thus PredicateTransitivePropagate wouldn't see
> the predicate. Then I tried putting PredicateTransitivePropagate
> after PredicatePushDown, and I saw predicate "t1.col = 1" was propagated to
> t2.
Are you trying a recent build?
I ran the exact same with tonight's hive-2.0 build, with two temp-tables
and got:
create temporary table t1(x int, y int);
create temporary table t2(x int, y int);
explain select * from t1 left join t2 on t1.x = t2.x where (t1.x = 1 or t1.x = 2);
Select Operator [SEL_6]
   outputColumnNames:["_col0","_col1","_col2","_col3"]
   Statistics:Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
   Map Join Operator [MAPJOIN_11]
   |  condition map:[{"":"Left Outer Join0 to 1"}]
   |  keys:{"Map 2":"x (type: int)","Map 1":"x (type: int)"}
   |  outputColumnNames:["_col0","_col1","_col5","_col6"]
   |  Statistics:Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
   |<-Map 2 [BROADCAST_EDGE]
   |  Reduce Output Operator [RS_3]
   |     key expressions:x (type: int)
   |     Map-reduce partition columns:x (type: int)
   |     sort order:+
   |     Statistics:Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
   |     value expressions:y (type: int)
   |     Filter Operator [FIL_10]
   |        predicate:((x = 1) or (x = 2)) (type: boolean)
   |        Statistics:Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
   |        TableScan [TS_1]
   |           alias:t2
   |           Statistics:Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
   |<-Filter Operator [FIL_9]
         predicate:((x = 1) or (x = 2)) (type: boolean)
         Statistics:Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
         TableScan [TS_0]
            alias:t1
            Statistics:Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
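
Note that the same predicate shows up on both TS_0 (t1) and TS_1 (t2). For
anyone following along, here is a toy Python sketch of the idea behind
transitive propagation through an equi-join key. It is only an illustration,
not Hive's code: the function name `propagate` and the data layout (a filter
as a set of allowed integer values per column) are made up for this example,
and it ignores the join type, which a real optimizer has to take into account.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: a column is (table alias, column name).
Column = Tuple[str, str]


def propagate(filters: Dict[Column, Set[int]],
              join_equalities: List[Tuple[Column, Column]]) -> Dict[Column, Set[int]]:
    """Copy 'column IN {values}' filters across equi-join keys.

    Toy model only: repeats until no filter changes, and intersects
    value sets when both sides of a join key already carry a filter.
    """
    result = {col: set(vals) for col, vals in filters.items()}
    changed = True
    while changed:
        changed = False
        for left, right in join_equalities:
            # Try both directions of the equality left = right.
            for src, dst in ((left, right), (right, left)):
                if src not in result:
                    continue
                if dst in result:
                    new_vals = result[src] & result[dst]
                else:
                    new_vals = set(result[src])
                if result.get(dst) != new_vals:
                    result[dst] = new_vals
                    changed = True
    return result


# Mirrors the query above: filter (x = 1 or x = 2) on t1, join on t1.x = t2.x.
derived = propagate({("t1", "x"): {1, 2}}, [(("t1", "x"), ("t2", "x"))])
print(derived)
```

In this toy model the filter on t1.x ends up on t2.x as well, which is the
shape of what FIL_10 shows in the plan.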
Cheers,
Gopal