hive-dev mailing list archives

From "Szehon Ho (JIRA)" <>
Subject [jira] [Updated] (HIVE-8943) Fix memory limit check for combine nested mapjoins [Spark Branch]
Date Fri, 21 Nov 2014 23:56:33 GMT


Szehon Ho updated HIVE-8943:
    Attachment: HIVE-8943.patch

Attaching a patch that adds the size of the connected mapjoin operators in the same work
(spark-stage) to the calculation of whether to convert the current join to a mapjoin.
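
For readers following along, here is a minimal sketch of the kind of check described above.
It is not the code in HIVE-8943.patch. The class name, method name, and parameters are all
hypothetical, and it assumes the small-table sizes are already known as byte counts and are
compared against a single memory threshold.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not taken from HIVE-8943.patch.
public class MapJoinMemoryCheck {

    /**
     * Returns true if the candidate join's small-table size, added to the
     * sizes of the mapjoins already connected in the same work (spark-stage),
     * stays within the threshold. Sizes are assumed to be non-negative.
     */
    public static boolean fitsInMemory(long candidateBytes,
                                       List<Long> connectedMapJoinBytes,
                                       long thresholdBytes) {
        long total = candidateBytes;
        for (long size : connectedMapJoinBytes) {
            total += size;
            if (total > thresholdBytes) {
                return false; // the combined mapjoins would exceed the limit
            }
        }
        return total <= thresholdBytes;
    }

    public static void main(String[] args) {
        List<Long> connected = Arrays.asList(200L, 300L);
        // 100 + 200 + 300 = 600, within a limit of 1000
        System.out.println(fitsInMemory(100L, connected, 1000L));
        // 600 + 200 + 300 = 1100, over a limit of 1000
        System.out.println(fitsInMemory(600L, connected, 1000L));
    }
}
```

The point of the sketch is only that the limit applies to the sum over every mapjoin in the
work, not to the current join by itself.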

I added two unit tests to stress this case, but would rather wait for HIVE-8946, as the tests
won't be using mapjoin until then.

> Fix memory limit check for combine nested mapjoins [Spark Branch]
> -----------------------------------------------------------------
>                 Key: HIVE-8943
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>         Attachments: HIVE-8943.patch
> It's the opposite problem of what we thought in HIVE-8701.
> SparkMapJoinOptimizer actually does combine nested mapjoins into one work, due to the
> removal of the RS for the big table. So we need to enhance the check to calculate whether
> all the MapJoins in that work (spark-stage) will fit into memory. Otherwise they might
> overwhelm the memory of that particular spark executor.

This message was sent by Atlassian JIRA