hive-dev mailing list archives

From "Chao Sun" <>
Subject Review Request 30388: HIVE-9103 - Support backup task for join related optimization [Spark Branch]
Date Thu, 29 Jan 2015 01:05:44 GMT

This is an automatically generated e-mail. To reply, visit:

Review request for hive and Xuefu Zhang.

Bugs: HIVE-9103

Repository: hive-git


This patch adds a backup task to the map join task. The backup task, which uses a common
join, is triggered if the map join task fails.

Note that no matter how many map joins there are in the SparkTask, we generate only one
backup task. This means that if the original task fails at the very last map join, the
whole task is re-executed.
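
The fallback behavior described above can be sketched roughly as follows. This is only
an illustration: SketchTask, runWithBackup, and the join-step predicate are hypothetical
names made up for this example and are not the classes touched by the patch. It shows a
single backup task attached to the whole task, so a failure at the last map join causes
every join to be re-run by the backup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical sketch; these are not Hive's actual task or driver classes. */
public class BackupTaskSketch {

    /** A task made of several join steps, with at most one backup task. */
    static final class SketchTask {
        final String name;
        final int joinCount;
        final IntPredicate joinSucceeds; // assumption: decides whether join i succeeds
        SketchTask backup;               // single backup for the whole task, may be null

        SketchTask(String name, int joinCount, IntPredicate joinSucceeds) {
            this.name = name;
            this.joinCount = joinCount;
            this.joinSucceeds = joinSucceeds;
        }

        /** Runs every join in order and returns false at the first failure. */
        boolean run(List<String> log) {
            for (int i = 0; i < joinCount; i++) {
                log.add(name + ":join" + i);
                if (!joinSucceeds.test(i)) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Runs the task; if it fails, runs its backup from the beginning. */
    static boolean runWithBackup(SketchTask task, List<String> log) {
        if (task.run(log)) {
            return true;
        }
        return task.backup != null && task.backup.run(log);
    }

    public static void main(String[] args) {
        // The map join task fails at its last (third) join.
        SketchTask mapJoin = new SketchTask("mapjoin", 3, i -> i < 2);
        mapJoin.backup = new SketchTask("commonjoin", 3, i -> true);

        List<String> log = new ArrayList<>();
        boolean ok = runWithBackup(mapJoin, log);
        System.out.println(ok + " " + log);
    }
}
```

In this sketch the log ends up with all three map join steps followed by all three common
join steps, which mirrors the cost noted above of having one backup task per SparkTask.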

The handling of the backup task is a little different from what MR does, mostly because we
convert JOIN to MAPJOIN during the operator plan optimization phase, at which time no
task/work exists yet. In the patch, we clone the whole operator tree before the JOIN
operator is converted. The cloned operator tree is then processed to generate a separate
work tree for a separate backup SparkTask.
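
The clone-before-convert idea can be illustrated with a small sketch. The Op class and
its deepCopy and convertJoins methods are hypothetical and heavily simplified (a plain
tree with string operator types, whereas real operator plans can have multiple parents);
they are not the operator classes or utilities used in the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of cloning an operator tree before JOIN is rewritten. */
public class CloneBeforeConvertSketch {

    /** Simplified operator node: a type label plus child operators. */
    static final class Op {
        String type;
        final List<Op> children = new ArrayList<>();

        Op(String type, Op... kids) {
            this.type = type;
            for (Op k : kids) {
                children.add(k);
            }
        }

        /** Recursively copies this node and all of its descendants. */
        Op deepCopy() {
            Op copy = new Op(type);
            for (Op c : children) {
                copy.children.add(c.deepCopy());
            }
            return copy;
        }

        /** Rewrites every JOIN in this subtree to MAPJOIN, in place. */
        void convertJoins() {
            if (type.equals("JOIN")) {
                type = "MAPJOIN";
            }
            for (Op c : children) {
                c.convertJoins();
            }
        }

        @Override
        public String toString() {
            return children.isEmpty() ? type : type + children;
        }
    }

    public static void main(String[] args) {
        Op plan = new Op("TS", new Op("JOIN", new Op("SEL", new Op("FS"))));

        // Clone first, so the backup plan still contains the common JOIN.
        Op backupPlan = plan.deepCopy();
        plan.convertJoins();

        System.out.println("main plan:   " + plan);
        System.out.println("backup plan: " + backupPlan);
    }
}
```

Because the copy is taken before the conversion, the backup plan keeps its JOIN operator
and can later be compiled into its own work tree, independent of the converted plan.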


  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/ 69004dc
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/ 79c3e02
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/ d57ceff
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/ 9ff47c7
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/ 6e0ac38
  ql/src/java/org/apache/hadoop/hive/ql/parse/ b838bff
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/ 773cfbd
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/ f7586a4
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/ 3a7477a
  ql/src/java/org/apache/hadoop/hive/ql/plan/ 0e85990
  ql/src/test/results/clientpositive/spark/auto_join25.q.out ab01b8a





Chao Sun
