hive-dev mailing list archives

From "Navis Ryu" <navis....@nexr.com>
Subject Re: Review Request 16728: Implement non-staged MapJoin
Date Mon, 13 Jan 2014 04:43:05 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16728/
-----------------------------------------------------------

(Updated Jan. 13, 2014, 4:43 a.m.)


Review request for hive.


Changes
-------

Rebased to trunk and added an entry to hive-default.xml.template.


Bugs: HIVE-6144
    https://issues.apache.org/jira/browse/HIVE-6144


Repository: hive-git


Description
-------

For a map join, all data from the small aliases is hashed and stored in a temporary file by MapredLocalTask.
For aliases that have no filter or projection, however, this staging step does not seem necessary. For
example:

{noformat}
select a.* from src a join src b on a.key=b.key;
{noformat}

produces a plan like this:
{noformat}
STAGE PLANS:
  Stage: Stage-4
    Map Reduce Local Work
      Alias -> Map Local Tables:
        a 
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        a 
          TableScan
            alias: a
            HashTable Sink Operator
              condition expressions:
                0 {key} {value}
                1 
              handleSkewJoin: false
              keys:
                0 [Column[key]]
                1 [Column[key]]
              Position of Big Table: 1

  Stage: Stage-3
    Map Reduce
      Alias -> Map Operator Tree:
        b 
          TableScan
            alias: b
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              condition expressions:
                0 {key} {value}
                1 
              handleSkewJoin: false
              keys:
                0 [Column[key]]
                1 [Column[key]]
              outputColumnNames: _col0, _col1
              Position of Big Table: 1
              Select Operator
                File Output Operator
      Local Work:
        Map Reduce Local Work
  Stage: Stage-0
    Fetch Operator
{noformat}

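Table src (alias a) is fetched and stored as-is by MapredLocalTask in a separate stage (Stage-4). With
this patch, the plan can instead look like the one below, where the local work for alias a is carried
inside Stage-3: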
{noformat}
  Stage: Stage-3
    Map Reduce
      Alias -> Map Operator Tree:
        b 
          TableScan
            alias: b
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              condition expressions:
                0 {key} {value}
                1 
              handleSkewJoin: false
              keys:
                0 [Column[key]]
                1 [Column[key]]
              outputColumnNames: _col0, _col1
              Position of Big Table: 1
              Select Operator
                  File Output Operator
      Local Work:
        Map Reduce Local Work
          Alias -> Map Local Tables:
            a 
              Fetch Operator
                limit: -1
          Alias -> Map Local Operator Tree:
            a 
              TableScan
                alias: a
          Has Any Stage Alias: false
  Stage: Stage-0
    Fetch Operator
{noformat}
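
To compare the two plans locally, something like the following could be run before and after applying the patch. Note that this message does not name the new configuration entry. The property hive.auto.convert.join.use.nonstaged is the name documented for HIVE-6144 in released Hive, so treat it as an assumption for this revision of the patch. Whether a map join is chosen at all also depends on the table sizes and the existing auto-convert settings.
{noformat}
-- Assumption: property name taken from released Hive documentation, not from this message.
set hive.auto.convert.join=true;
set hive.auto.convert.join.use.nonstaged=true;

-- Small alias "a" has no filter or projection, so it is a candidate for non-staged loading.
explain
select a.* from src a join src b on a.key=b.key;
{noformat}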


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 16d54c6 
  conf/hive-default.xml.template d188f2a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java d8f4eb4 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableLoader.java a080fcc 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java aa8f19c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 1e0314d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java bdc85b9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TemporaryHashSinkOperator.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 5511bca 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HashTableLoader.java efe5710 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 0cc90d0 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java 5a53e15 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java 010ac54 
  ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java 14fced7 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredLocalWork.java 83a778d 
  ql/src/test/queries/clientpositive/auto_join_without_localtask.q PRE-CREATION 
  ql/src/test/results/clientpositive/auto_join_without_localtask.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/16728/diff/


Testing
-------


Thanks,

Navis Ryu

