hive-issues mailing list archives

From "Chengxiang Li (JIRA)" <>
Subject [jira] [Updated] (HIVE-11082) Support multi edge between nodes in SparkPlan[Spark Branch]
Date Tue, 14 Jul 2015 06:40:04 GMT


Chengxiang Li updated HIVE-11082:
    Attachment: HIVE-11082.1-spark.patch

SparkPlan supports multiple edges between nodes by default; this patch just removes the check in SparkPlan::connect.
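
As a rough illustration of what allowing multiple edges means, here is a minimal sketch. It is not Hive's SparkPlan: the class MultiEdgePlan, its methods, and the use of strings as node names are all hypothetical. It assumes children are stored in a list, so connecting the same pair twice records two parallel edges instead of failing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a plan graph; NOT Hive's actual SparkPlan class.
class MultiEdgePlan {
    // Children are kept in a list, so the same child may appear more than once.
    private final Map<String, List<String>> children = new HashMap<>();

    // Records one edge from parent to child. There is no duplicate check,
    // so repeated calls with the same pair add parallel edges.
    void connect(String parent, String child) {
        children.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
    }

    // Counts the edges that go from parent to child.
    int edgeCount(String parent, String child) {
        int count = 0;
        for (String c : children.getOrDefault(parent, new ArrayList<>())) {
            if (c.equals(child)) {
                count++;
            }
        }
        return count;
    }
}
```

Calling connect("map", "reduce") twice on such a plan leaves two edges between the same pair of nodes, which is the shape a self join or self union would need once its two source works are combined.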

However, self join/union does not actually benefit from RDD caching with this patch, because
a self join/union assigns different alias names to the source table, which makes the
ReduceSinkOperators in the different MapWorks unequal to each other.
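
The alias problem can be sketched as follows. This is not Hive's ReduceSinkOperator or its comparison logic, which is not shown in this message; SinkDesc and its fields are hypothetical. The sketch only assumes that the alias takes part in equality, so two otherwise identical scans of the same table compare as different.

```java
import java.util.Objects;

// Hypothetical simplified descriptor; NOT Hive's ReduceSinkOperator.
final class SinkDesc {
    final String table;      // source table name
    final String alias;      // alias assigned to the table in the query
    final String keyColumn;  // column used as the shuffle key

    SinkDesc(String table, String alias, String keyColumn) {
        this.table = table;
        this.alias = alias;
        this.keyColumn = keyColumn;
    }

    // Assumption: the alias is part of equality, so it can break matching.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SinkDesc)) return false;
        SinkDesc other = (SinkDesc) o;
        return table.equals(other.table)
            && alias.equals(other.alias)
            && keyColumn.equals(other.keyColumn);
    }

    @Override
    public int hashCode() {
        return Objects.hash(table, alias, keyColumn);
    }
}
```

Under this assumption, new SinkDesc("src", "a", "key") and new SinkDesc("src", "b", "key") are not equal even though they read the same table with the same key, so the two works would not be recognized as identical and would not share a cached RDD.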

> Support multi edge between nodes in SparkPlan[Spark Branch]
> -----------------------------------------------------------
>                 Key: HIVE-11082
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>         Attachments: HIVE-11082.1-spark.patch
> For the dynamic RDD caching optimization, we found that SparkPlan::connect throws an exception
> when we try to combine two works that share the same child. Supporting multiple edges between
> nodes in SparkPlan would help enable dynamic RDD caching in more use cases, such as self join
> and self union.

