hive-dev mailing list archives

From "Szehon Ho" <>
Subject Review Request 19549: HIVE-6395 multi-table insert from select transform fails if optimize.ppd enabled
Date Fri, 21 Mar 2014 21:00:06 GMT

Review request for hive.

Repository: hive-git


In this scenario, PPD on the script (transform) operator performed the following incorrect
predicate pushdown. The original plan

script --> filter (state=1)
           --> select, insert into test1
       --> filter (state=2)
           --> select, insert into test2

was rewritten into

script --> filter (state=1 and state=2)   // not possible
           --> select, insert into test1
           --> select, insert into test2
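
To make the effect of that rewrite concrete, here is a small toy model in Java. It is only a
sketch: "state" is modelled as an int column, each branch as a predicate, and none of the class
or method names below come from the Hive code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Toy model only; these names are hypothetical and not part of Hive.
class SiblingFilterDemo {

    // Original plan: every branch applies its own filter to the script output.
    static List<List<Integer>> perBranch(int[] states, List<IntPredicate> branchFilters) {
        List<List<Integer>> out = new ArrayList<>();
        for (IntPredicate f : branchFilters) {
            List<Integer> rows = new ArrayList<>();
            for (int s : states) {
                if (f.test(s)) {
                    rows.add(s);
                }
            }
            out.add(rows);
        }
        return out;
    }

    // Incorrect plan: the sibling predicates are ANDed into a single filter
    // whose output feeds every branch.
    static List<List<Integer>> mergedWithAnd(int[] states, List<IntPredicate> branchFilters) {
        IntPredicate merged = s -> true;
        for (IntPredicate f : branchFilters) {
            merged = merged.and(f);
        }
        List<Integer> rows = new ArrayList<>();
        for (int s : states) {
            if (merged.test(s)) {
                rows.add(s);
            }
        }
        List<List<Integer>> out = new ArrayList<>();
        for (int i = 0; i < branchFilters.size(); i++) {
            out.add(new ArrayList<>(rows));
        }
        return out;
    }

    public static void main(String[] args) {
        int[] states = {1, 2, 3, 1};
        List<IntPredicate> filters = List.of(s -> s == 1, s -> s == 2);
        System.out.println(perBranch(states, filters));     // [[1, 1], [2]]
        System.out.println(mergedWithAnd(states, filters)); // [[], []]
    }
}
```

With separate filters, test1 receives the state=1 rows and test2 the state=2 rows; with the
merged filter no row can satisfy both conditions, so both targets end up empty.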

The bug was a combination of two things: first, these filters were chosen by FilterPPD, and
second, ScriptPPD called the sequence "mergeWithChildrenPred / createFilters (pred)", which
performed the above transformation.  ScriptPPD was one of the few simple operators that did this.
I also tried some other combinations, such as extract (see my added test in transform_ppr2.q) and
just a select operator.

The fix is to skip marking a predicate as a 'candidate' for pushdown if its filter is a sibling
of another filter.  We still want to push down children of select transform with grandchildren,

  ql/src/java/org/apache/hadoop/hive/ql/ppd/ 40298e1 
  ql/src/test/queries/clientpositive/transform_ppd_multi.q PRE-CREATION 
  ql/src/test/queries/clientpositive/transform_ppr2.q 85ef3ac 
  ql/src/test/results/clientpositive/transform_ppd_multi.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/transform_ppr2.q.out 4bddc69 



Reproduced the issue in transform_ppd_multi.q, and also reproduced a similar issue with an
extract (cluster) operator in transform_ppr2.q.  Ran the other transform_ppd and general ppd
tests to ensure there are no regressions.


Szehon Ho
