hive-dev mailing list archives

From "Szehon Ho (JIRA)" <>
Subject [jira] [Updated] (HIVE-8639) Convert SMBJoin to MapJoin [Spark Branch]
Date Wed, 17 Dec 2014 23:57:14 GMT


Szehon Ho updated HIVE-8639:
    Attachment: HIVE-8639.3-spark.patch

I think there was a temporary issue downloading a build dependency, so I am attaching the same patch again.

Also updated the review-board.

> Convert SMBJoin to MapJoin [Spark Branch]
> -----------------------------------------
>                 Key: HIVE-8639
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>         Attachments: HIVE-8639.1-spark.patch, HIVE-8639.2-spark.patch, HIVE-8639.3-spark.patch,
> HIVE-8202 supports auto-conversion of SMB join. However, if the tables are partitioned,
> there could be a slowdown, as each mapper would need to get a very small chunk of a partition
> that has a single key. Thus, in some scenarios it is beneficial to convert SMB join to map join.
> The task is to research and support the conversion from SMB join to map join for the Spark
> execution engine. See the MapReduce equivalent in SortMergeJoinResolver.

This message was sent by Atlassian JIRA
