hive-dev mailing list archives

From "Xuefu Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-8840) Print prettier Spark work graph after HIVE-8793 [Spark Branch]
Date Fri, 14 Nov 2014 18:54:34 GMT

    [ https://issues.apache.org/jira/browse/HIVE-8840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14212625#comment-14212625 ]

Xuefu Zhang commented on HIVE-8840:
-----------------------------------

Patch looks good. Originally I thought we would need to make changes to ExplainTask, but this
is intuitive and solves the problem. I only have a minor comment on RB.

> Print prettier Spark work graph after HIVE-8793 [Spark Branch]
> --------------------------------------------------------------
>
>                 Key: HIVE-8840
>                 URL: https://issues.apache.org/jira/browse/HIVE-8840
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Jimmy Xiang
>             Fix For: spark-branch
>
>         Attachments: HIVE-8840.1-spark.patch
>
>
> Because of HIVE-8793, the work graph for Spark may be modified by SplitSparkWorkResolver.
> Original:
> {code}
>     Spark
>       Edges:
>         Reducer 2 <- Map 1 (SORT, 1)
>         Reducer 3 <- Reducer 2 (GROUP, 1)
>         Reducer 4 <- Reducer 2 (GROUP, 1)
> {code}
> New graph:
> {code}
>     Spark
>       Edges:
>         Reducer 3 <- Reducer 5 (GROUP, 1)
>         Reducer 4 <- Reducer 6 (GROUP, 1)
>         Reducer 5 <- Map 1 (SORT, 1)
>         Reducer 6 <- Map 1 (SORT, 1)
> {code}
> where Reducer 2 was split into Reducer 5 and Reducer 6.
> Two types of ordering can be considered:
> 1. Topological order
> {code}
>     Spark
>       Edges:
>         Reducer 5 <- Map 1 (SORT, 1)
>         Reducer 6 <- Map 1 (SORT, 1)
>         Reducer 3 <- Reducer 5 (GROUP, 1)
>         Reducer 4 <- Reducer 6 (GROUP, 1)
> {code}
> 2. DFS (depth-first) order
> {code}
>     Spark
>       Edges:
>         Reducer 5 <- Map 1 (SORT, 1)
>         Reducer 3 <- Reducer 5 (GROUP, 1)
>         Reducer 6 <- Map 1 (SORT, 1)
>         Reducer 4 <- Reducer 6 (GROUP, 1)
> {code}
> Both seem better than the current output, though topological order seems more suitable
> for a graph. Please feel free to create a patch on trunk if needed.
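
For illustration only, the two orderings proposed above can be reproduced with a small
standalone sketch. This is not the code from HIVE-8840.1-spark.patch or from ExplainTask;
the function names (build, topological_lines, dfs_lines) and the tuple layout are
hypothetical, and the edge data is copied from the "New graph" listing. Siblings are
visited in plain string order, which is an assumption made here to get stable output
(it would place "Reducer 10" before "Reducer 2").

```python
from collections import defaultdict, deque

# Edges of the modified graph as (child, parent, label) tuples,
# copied from the "New graph" listing in the issue description.
EDGES = [
    ("Reducer 3", "Reducer 5", "GROUP, 1"),
    ("Reducer 4", "Reducer 6", "GROUP, 1"),
    ("Reducer 5", "Map 1", "SORT, 1"),
    ("Reducer 6", "Map 1", "SORT, 1"),
]


def build(edges):
    """Index the edges by parent and by child, and find the root works."""
    children = defaultdict(list)
    parents = defaultdict(list)
    nodes = set()
    for child, parent, label in edges:
        children[parent].append(child)
        parents[child].append((parent, label))
        nodes.update((child, parent))
    for kids in children.values():
        kids.sort()  # assumption: string order, only to make output stable
    roots = sorted(n for n in nodes if n not in parents)
    return children, parents, roots


def format_line(node, parents):
    """Render one 'child <- parent (label)' line, joining multiple parents."""
    sources = ", ".join(f"{p} ({label})" for p, label in parents[node])
    return f"{node} <- {sources}"


def topological_lines(edges):
    """Kahn's algorithm: a work is printed only after all of its parents."""
    children, parents, roots = build(edges)
    remaining = {n: len(ps) for n, ps in parents.items()}
    queue = deque(roots)
    lines = []
    while queue:
        node = queue.popleft()
        if node in parents:  # roots have no incoming edge to print
            lines.append(format_line(node, parents))
        for kid in children[node]:
            remaining[kid] -= 1
            if remaining[kid] == 0:
                queue.append(kid)
    return lines


def dfs_lines(edges):
    """Depth-first preorder from each root: a branch is printed to its end
    before the next sibling starts."""
    children, parents, roots = build(edges)
    seen = set()
    lines = []

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        if node in parents:
            lines.append(format_line(node, parents))
        for kid in children[node]:
            visit(kid)

    for root in roots:
        visit(root)
    return lines


if __name__ == "__main__":
    print("Topological:")
    for line in topological_lines(EDGES):
        print("  " + line)
    print("DFS:")
    for line in dfs_lines(EDGES):
        print("  " + line)
```

On this graph the sketch prints the Reducer 5 and Reducer 6 lines before the Reducer 3 and
Reducer 4 lines in topological mode, and interleaves them branch by branch in DFS mode,
matching listings 1 and 2. Note that in DFS preorder a work with several parents can be
printed before one of those parents has appeared, which is one reason topological order
may read more naturally for a general graph.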



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
