hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From wzc1989 <wzc1...@gmail.com>
Subject 回复: different outer join plan between hive 0.9 and hive 0.10
Date Mon, 13 May 2013 16:11:52 GMT
This time i cherry-pick HIVE-3464, HIVE-4212, HIVE-4206 and some related commits and the above
explain result matches in hive 0.9 and hive 0.10, thanks!
But I confuse about this error msg:

 JOINNODE_OUTERJOIN_MORETHAN_16(10142, "Single join node containing outer join(s) " +
      "cannot have more than 16 aliases"),


does this mean in hive0.10 when we have more than 16 outer join the query plan will still
have some bug?
I test the sql below and find the explain result still diff between hive 0.9 and hive 0.10.
 

explain select
sum(a.value) val
from default.test_join a
left outer join default.test_join b on a.key = b.key
left outer join default.test_join c on a.key = c.key
left outer join default.test_join d on a.key = d.key
left outer join default.test_join e on a.key = e.key
left outer join default.test_join f on a.key = f.key
left outer join default.test_join g on a.key = g.key
left outer join default.test_join h on a.key = h.key
left outer join default.test_join i on a.key = i.key
left outer join default.test_join j on a.key = j.key
left outer join default.test_join k on a.key = k.key
left outer join default.test_join l on a.key = l.key
left outer join default.test_join m on a.key = m.key
left outer join default.test_join n on a.key = n.key
left outer join default.test_join u on a.key = u.key
left outer join default.test_join v on a.key = v.key
left outer join default.test_join w on a.key = w.key
left outer join default.test_join x on a.key = x.key
left outer join default.test_join z on a.key = z.key



--  
wzc1989
已使用 Sparrow (http://www.sparrowmailapp.com/?sig)


在 2013年3月29日星期五,上午9:34,Navis류승우 写道:

> The problem is mixture of issues (HIVE-3411, HIVE-4209, HIVE-4212,
> HIVE-3464) and still not completely fixed even in trunk.
>  
> Will be fixed shortly.
>  
> 2013/3/29 wzc <wzc1989@gmail.com (mailto:wzc1989@gmail.com)>:
> > The bug remains even if I apply the patch in HIVE-4206 :( The explain
> > result hasn't change.
> >  
> >  
> > 2013/3/28 Navis류승우 <navis.ryu@nexr.com (mailto:navis.ryu@nexr.com)>
> > >  
> > > It's a bug (https://issues.apache.org/jira/browse/HIVE-4206).
> > >  
> > > Thanks for reporting it.
> > >  
> > > 2013/3/24 wzc <wzc1989@gmail.com (mailto:wzc1989@gmail.com)>:
> > > > Recently we tried to upgrade our hive from 0.9 to 0.10, but found some
> > > > of
> > > > our hive queries almost 7 times slow. One of such query consists
> > > > multiple
> > > > table outer join on the same key. By looking into the query, we found
> > > > the
> > > > query plans generate by hive 0.9 and hive 0.10 are different. Here is
> > > > the
> > > > example:
> > > >  
> > > > testcase:
> > > >  
> > > > use default;
> > > > create table test_join (
> > > > `key` string,
> > > > `value` string
> > > > );
> > > >  
> > > > explain select
> > > > sum(a.value) val
> > > > from default.test_join a
> > > > left outer join default.test_join b on a.key = b.key
> > > > left outer join default.test_join c on a.key = c.key
> > > > left outer join default.test_join d on a.key = d.key
> > > > left outer join default.test_join e on a.key = e.key
> > > > left outer join default.test_join f on a.key = f.key
> > > > left outer join default.test_join g on a.key = g.key
> > > >  
> > > >  
> > > > the explain of hive 0.9:
> > > >  
> > > > STAGE DEPENDENCIES:
> > > >  
> > > > Stage-1 is a root stage
> > > >  
> > > > Stage-2 depends on stages: Stage-1
> > > >  
> > > > Stage-0 is a root stage
> > > >  
> > > > ...
> > > >  
> > > > Reduce Operator Tree:
> > > >  
> > > > Join Operator
> > > >  
> > > > condition map:
> > > >  
> > > > Left Outer Join0 to 1
> > > >  
> > > > Left Outer Join0 to 2
> > > >  
> > > > Left Outer Join0 to 3
> > > >  
> > > > Left Outer Join0 to 4
> > > >  
> > > > Left Outer Join0 to 5
> > > >  
> > > > Left Outer Join0 to 6
> > > >  
> > > > condition expressions:
> > > >  
> > > > 0 {VALUE._col1}
> > > >  
> > > > 1
> > > >  
> > > > 2
> > > >  
> > > > 3
> > > >  
> > > > 4
> > > >  
> > > > 5
> > > >  
> > > > 6
> > > >  
> > > > ......
> > > >  
> > > >  
> > > > while the explain of hive 0.10:
> > > >  
> > > > STAGE DEPENDENCIES:
> > > >  
> > > > Stage-6 is a root stage
> > > >  
> > > > Stage-1 depends on stages: Stage-6
> > > >  
> > > > Stage-2 depends on stages: Stage-1
> > > >  
> > > > Stage-0 is a root stage
> > > >  
> > > > ...
> > > >  
> > > > Reduce Operator Tree:
> > > >  
> > > > Join Operator
> > > >  
> > > > condition map:
> > > >  
> > > > Left Outer Join0 to 1
> > > >  
> > > > Left Outer Join0 to 2
> > > >  
> > > > condition expressions:
> > > >  
> > > > 0 {VALUE._col0} {VALUE._col1}
> > > >  
> > > > 1
> > > >  
> > > > 2
> > > >  
> > > > ...
> > > >  
> > > > Reduce Operator Tree:
> > > >  
> > > > Join Operator
> > > >  
> > > > condition map:
> > > >  
> > > > Left Outer Join0 to 1
> > > >  
> > > > Left Outer Join0 to 2
> > > >  
> > > > Left Outer Join0 to 3
> > > >  
> > > > Left Outer Join0 to 4
> > > >  
> > > > condition expressions:
> > > >  
> > > > 0 {VALUE._col9}
> > > >  
> > > > 1
> > > >  
> > > > 2
> > > >  
> > > > 3
> > > >  
> > > > 4
> > > >  
> > > > ....
> > > >  
> > > >  
> > > > It seems like hive 0.9 use only one stage/job to process all outer joins
> > > > but
> > > > hive 0.10 split them into two stage. When running such kind of query on
> > > > hive0.10 in production, in the second stage of outer join process, some
> > > > reducer stucks.
> > > >  
> > > > I can't find any param to change the query plain , can anyone give me
> > > > some
> > > > hint?
> > > >  
> > > > Thanks!  


Mime
View raw message