hive-user mailing list archives

From Tim Kaldewey
Subject Number of map reduce jobs generated
Date Thu, 24 Mar 2011 21:29:54 GMT


I noticed that for join queries that include an aggregate, Hive generates a
query plan with two MR jobs: one performs the join and the second computes
the aggregate. Is there a way to hint Hive to combine these two operations
into 1 MR job? An example of the kind of query I am looking at follows.



select /*+ MAPJOIN(Table2) */ sum(t1_10 * t1_12)
  from Table1 join Table2 on (Table1.t1_6 = Table2.t2_1)
  where Table2.t2_5 = 1234
    and Table1.t1_12 >= 8 and Table1.t1_12 <= 10
    and Table1.t1_9 < 42;

To explain:
- Table2 is small, so I chose a map-side (broadcast) join.
- When I remove the aggregate, Hive generates only 1 MR job.
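
One way to see where the second job comes from is to prefix the query with
EXPLAIN and read the stage list Hive prints. The sketch below uses the same
query, with the range on t1_12 written out as two comparisons. The exact
stage layout depends on the Hive version and settings, so this is a way to
inspect the plan, not a statement of what it will contain.

```sql
-- Print the query plan instead of running the query.
-- The "STAGE DEPENDENCIES" section of the output lists the stages and
-- their order; "STAGE PLANS" shows the operators inside each stage.
explain
select /*+ MAPJOIN(Table2) */ sum(t1_10 * t1_12)
  from Table1 join Table2 on (Table1.t1_6 = Table2.t2_1)
  where Table2.t2_5 = 1234
    and Table1.t1_12 >= 8 and Table1.t1_12 <= 10
    and Table1.t1_9 < 42;
```

Running the same EXPLAIN with sum(...) replaced by a plain column list gives
the plan for the variant without the aggregate, so the two stage lists can be
compared side by side.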