hive-dev mailing list archives

From "Szehon Ho (JIRA)" <>
Subject [jira] [Updated] (HIVE-8943) Fix memory limit check for combine nested mapjoins [Spark Branch]
Date Wed, 26 Nov 2014 22:11:12 GMT


Szehon Ho updated HIVE-8943:
    Attachment: HIVE-8943.2-spark.patch

Giving another try.

The refactoring of the big-table calculation algorithm made it choose a different big table
when more than one is available. I tweaked the algorithm to choose the same one to minimize the

> Fix memory limit check for combine nested mapjoins [Spark Branch]
> -----------------------------------------------------------------
>                 Key: HIVE-8943
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>         Attachments: HIVE-8943.1-spark.patch, HIVE-8943.1-spark.patch, HIVE-8943.2-spark.patch
> It's the opposite of the problem we assumed in HIVE-8701.
> SparkMapJoinOptimizer does combine nested mapjoins into one work, because the RS (ReduceSink)
> for the big table is removed. So we need to enhance the check to calculate whether all the
> MapJoins in that work (spark-stage) will fit into memory; otherwise they might overwhelm the
> memory of that particular Spark executor.
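
The check described above can be illustrated with a minimal sketch. It is not the code from the patch: the class, method, and parameter names are hypothetical, and it assumes the check simply sums the small-table sizes of every MapJoin in one work and compares the total against a single byte threshold.

```java
import java.util.List;

// Hypothetical sketch: none of these names come from the Hive code base.
final class MapJoinMemoryCheck {

    /** One map join in a work, described only by the byte sizes of its small tables. */
    static final class MapJoin {
        final String name;
        final List<Long> smallTableSizes;

        MapJoin(String name, List<Long> smallTableSizes) {
            this.name = name;
            this.smallTableSizes = smallTableSizes;
        }
    }

    /** Sums the small-table sizes of all map joins that ended up in the same work. */
    static long totalSmallTableBytes(List<MapJoin> mapJoinsInWork) {
        long total = 0L;
        for (MapJoin mapJoin : mapJoinsInWork) {
            for (long size : mapJoin.smallTableSizes) {
                // addExact throws ArithmeticException on overflow instead of wrapping around.
                total = Math.addExact(total, size);
            }
        }
        return total;
    }

    /** True if the combined small tables of the whole work stay within the threshold. */
    static boolean fitsInMemory(List<MapJoin> mapJoinsInWork, long thresholdBytes) {
        return totalSmallTableBytes(mapJoinsInWork) <= thresholdBytes;
    }
}
```

With an assumed threshold of 800 bytes, a work holding one map join with a 400-byte small table and another with small tables of 300 and 200 bytes would be rejected: each join is under the threshold on its own (400 and 500 bytes), but the combined 900 bytes is not. That is the case a per-join check would miss.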

This message was sent by Atlassian JIRA
