hive-dev mailing list archives

From "Szehon Ho" <>
Subject Review Request 30443: HIVE-9192 : One-pass SMB Optimizations [Spark Branch]
Date Fri, 30 Jan 2015 03:27:43 GMT

This is an automatically generated e-mail. To reply, visit:

Review request for hive and Xuefu Zhang.

Repository: hive-git


This patch refactors the SMB MapJoin optimizations in Spark to be one-pass. The main part of
the SMB MapJoin optimization is annotating the MapWork with information from the
SMBMapJoinOperator and its roots (TableScans).

Instead of doing MapWork init/annotation in SparkSortMergeJoinFactory in a second pass,
both the GenSparkWork and SparkSortMergeJoinFactory classes now collect the needed information
during a single pass. After that pass, we go through all the SMBMapJoinOperators and annotate
their MapWorks.
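
For reviewers who want the shape of the change at a glance, here is a minimal, self-contained
sketch of the collect-then-annotate pattern described above. It is not the code in this patch:
the names Work, Context, recordRoots, recordWork and annotateAll are hypothetical stand-ins, and
operators are represented by plain strings rather than Hive's operator classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OnePassSmbSketch {
    // Hypothetical stand-in for a MapWork; this is not Hive's class.
    static class Work {
        final String name;
        final List<String> annotations = new ArrayList<>();
        Work(String name) { this.name = name; }
    }

    // Hypothetical context that is filled in while the operator tree is walked once.
    static class Context {
        // SMB join operator name -> names of its root table scans
        final Map<String, List<String>> joinToRoots = new LinkedHashMap<>();
        // SMB join operator name -> the work that contains it
        final Map<String, Work> joinToWork = new LinkedHashMap<>();
    }

    // Called by one collector during the single pass (assumed role: join factory).
    static void recordRoots(Context ctx, String join, List<String> roots) {
        ctx.joinToRoots.put(join, new ArrayList<>(roots));
    }

    // Called by the other collector during the same pass (assumed role: work generator).
    static void recordWork(Context ctx, String join, Work work) {
        ctx.joinToWork.put(join, work);
    }

    // Runs after the pass: visit every recorded join and annotate its work.
    static void annotateAll(Context ctx) {
        for (Map.Entry<String, Work> entry : ctx.joinToWork.entrySet()) {
            List<String> roots =
                ctx.joinToRoots.getOrDefault(entry.getKey(), Collections.emptyList());
            entry.getValue().annotations.add(
                entry.getKey() + " <- " + String.join(",", roots));
        }
    }

    public static void main(String[] args) {
        Context ctx = new Context();
        Work work = new Work("Map 1");
        recordRoots(ctx, "SMB_1", Arrays.asList("TS_0", "TS_1"));
        recordWork(ctx, "SMB_1", work);
        annotateAll(ctx);
        System.out.println(work.annotations);
    }
}
```

The point of the sketch is only that neither collector needs the other's data while walking
the tree; the two maps are joined once, after the walk, so no second traversal is required.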


  ql/src/java/org/apache/hadoop/hive/ql/optimizer/spark/ 6e0ac38

  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/ 773cfbd 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/ 0eac6e1 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/ cb5d4fe 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/ 3a7477a 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/ PRE-CREATION




Szehon Ho
