hive-dev mailing list archives

From "Xuefu Zhang" <xzh...@cloudera.com>
Subject Re: Review Request 19549: HIVE-6395 multi-table insert from select transform fails if optimize.ppd enabled
Date Fri, 21 Mar 2014 21:54:59 GMT


> On March 21, 2014, 9:50 p.m., Xuefu Zhang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java, line 180
> > <https://reviews.apache.org/r/19549/diff/1/?file=531817#file531817line180>
> >
> >     Just for my understanding, for the given example, what's the filterOp, what's the parent, and what are the siblings?
> 
> Szehon Ho wrote:
>     Hi Xuefu, thanks for looking.  As in my ASCII diagram above, the filter op is the (Filter). The parent is the script operator.

I guess "script" is the parent, based on your comments.


- Xuefu


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19549/#review38215
-----------------------------------------------------------


On March 21, 2014, 9:05 p.m., Szehon Ho wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19549/
> -----------------------------------------------------------
> 
> (Updated March 21, 2014, 9:05 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> In this scenario, PPD on the script (transform) operator performed the following incorrect predicate pushdown:
> 
> script --> filter (state=1)
>            --> select, insert into test1
>        --> filter (state=2)
>            --> select, insert into test2
> 
> into:
> 
> script --> filter (state=1 and state=2)   //not possible.
>          --> select, insert into test1
>          --> select, insert into test2
> 
> 
> The bug was a combination of two things: first, these filters were chosen by FilterPPD as 'candidate' pushdown predicates; second, ScriptPPD called "mergeWithChildrenPred + createFilters", which performed the above transformation because the predicates were marked as candidates.
> 
> ScriptPPD was one of the few simple operators that did this. I tried some other parent operators, such as extract (see my added test in transform_ppr2.q) and a plain select operator, and could not produce the issue with those.
> 
> The fix is to skip marking a predicate as a 'candidate' for pushdown if its filter is a sibling of another filter.  We still want to push down children of the transform operator with grandchildren, etc.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1 
>   ql/src/test/queries/clientpositive/transform_ppd_multi.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/transform_ppr2.q 85ef3ac 
>   ql/src/test/results/clientpositive/transform_ppd_multi.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/transform_ppr2.q.out 4bddc69 
> 
> Diff: https://reviews.apache.org/r/19549/diff/
> 
> 
> Testing
> -------
> 
> Reproduced the issue in transform_ppd_multi.q, and also covered a similar case with an extract (cluster) operator in transform_ppr2.q.  Ran the other transform_ppd and general ppd tests to ensure no regression.
> 
> 
> Thanks,
> 
> Szehon Ho
> 
>
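
The sibling check described in the quoted review can be illustrated with a small, self-contained sketch. This is not the code in OpProcFactory.java: the Op class, hasSiblingFilter, and candidatePredicates are hypothetical names invented for illustration, and the operator tree is a toy model rather than Hive's operator classes. It only shows the idea that a filter's predicate is not marked as a pushdown candidate when another filter shares the same parent.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of an operator tree; NOT Hive's Operator classes.
public class PpdSiblingSketch {

    static class Op {
        final String kind;       // e.g. "script", "filter", "select"
        final String predicate;  // only set for filters, otherwise null
        Op parent;
        final List<Op> children = new ArrayList<>();

        Op(String kind, String predicate) {
            this.kind = kind;
            this.predicate = predicate;
        }

        // Attach a child and return this operator so calls can be chained.
        Op add(Op child) {
            child.parent = this;
            children.add(child);
            return this;
        }
    }

    // True if some other child of op's parent is also a filter.
    static boolean hasSiblingFilter(Op op) {
        if (op.parent == null) {
            return false;
        }
        for (Op sibling : op.parent.children) {
            if (sibling != op && "filter".equals(sibling.kind)) {
                return true;
            }
        }
        return false;
    }

    // Predicates of filter children that may be pushed above the parent.
    // Filters with a sibling filter are skipped, so their predicates are
    // never merged into one impossible conjunction such as
    // (state=1 and state=2).
    static List<String> candidatePredicates(Op parent) {
        List<String> candidates = new ArrayList<>();
        for (Op child : parent.children) {
            if ("filter".equals(child.kind) && !hasSiblingFilter(child)) {
                candidates.add(child.predicate);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        // Multi-insert shape from the review: two sibling filters under script.
        Op script = new Op("script", null);
        script.add(new Op("filter", "state=1").add(new Op("select", null)));
        script.add(new Op("filter", "state=2").add(new Op("select", null)));
        System.out.println(candidatePredicates(script)); // prints []

        // Single-insert shape: the lone filter is still a candidate.
        Op single = new Op("script", null);
        single.add(new Op("filter", "state=1").add(new Op("select", null)));
        System.out.println(candidatePredicates(single)); // prints [state=1]
    }
}
```

In this toy model the two sibling filters stay below the script operator, while a lone filter under the script operator remains eligible for pushdown.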

