hive-dev mailing list archives

From "Szehon Ho (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-8943) Fix memory limit check for combine nested mapjoins [Spark Branch]
Date Fri, 21 Nov 2014 23:56:33 GMT

     [ https://issues.apache.org/jira/browse/HIVE-8943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Szehon Ho updated HIVE-8943:
----------------------------
    Attachment: HIVE-8943.patch

Attaching a patch that adds the sizes of the connected mapjoin operators in the same work
(spark-stage) to the calculation that decides whether to convert the current join to a mapjoin.

I added two unit tests to exercise this case, but would rather wait for HIVE-8946, as the tests
won't be using mapjoin until then.
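
For readers of the archive, the check described above can be sketched roughly as follows. This is
a minimal illustration, not the code in HIVE-8943.patch: the class name MapJoinMemoryCheck, the
method canConvertToMapJoin, and all three parameters are hypothetical, and the sketch assumes
that the small-table sizes of the mapjoins already in the same work are known and non-negative.

```java
import java.util.List;

// Hypothetical sketch only; the names here are NOT taken from Hive or from the patch.
final class MapJoinMemoryCheck {

    private MapJoinMemoryCheck() {}

    /**
     * Decides whether a join may be converted to a mapjoin.
     *
     * @param candidateSmallTableBytes small-table size of the join being considered (assumed input)
     * @param connectedMapJoinBytes    small-table sizes of mapjoins already in the same work (assumed input)
     * @param maxBytes                 memory threshold for the whole work (assumed input)
     * @return true only if the candidate plus all connected mapjoins fit under the threshold
     */
    static boolean canConvertToMapJoin(long candidateSmallTableBytes,
                                       List<Long> connectedMapJoinBytes,
                                       long maxBytes) {
        long total = candidateSmallTableBytes;
        for (long size : connectedMapJoinBytes) {
            total += size;
            if (total > maxBytes) {
                // The combined hash tables would not fit; stop early.
                return false;
            }
        }
        // Also covers the case of a candidate that is too large on its own.
        return total <= maxBytes;
    }
}
```

With a threshold of 100 bytes, a 40-byte candidate next to connected mapjoins of 30 and 20 bytes
passes (90 in total), while the same candidate next to mapjoins of 50 and 20 bytes is rejected
(110 in total), even though each join would fit on its own.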

> Fix memory limit check for combine nested mapjoins [Spark Branch]
> -----------------------------------------------------------------
>
>                 Key: HIVE-8943
>                 URL: https://issues.apache.org/jira/browse/HIVE-8943
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>         Attachments: HIVE-8943.patch
>
>
> It's the opposite of the problem we assumed in HIVE-8701.
> SparkMapJoinOptimizer actually does combine nested mapjoins into one work, because the RS
> (ReduceSink) for the big table is removed. So we need to enhance the check to calculate whether
> all the MapJoins in that work (spark-stage) will fit into memory; otherwise they might overwhelm
> the memory of that particular Spark executor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
