hive-dev mailing list archives

From "Xuefu Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-9251) SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
Date Wed, 07 Jan 2015 03:13:34 GMT

    [ https://issues.apache.org/jira/browse/HIVE-9251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267223#comment-14267223 ]

Xuefu Zhang commented on HIVE-9251:
-----------------------------------

Hi Rui, for our unit tests the input size and cluster are both fixed, so it shouldn't matter
whether the reducer count is exposed in the plan. As to the question of whether or not to expose
it, we briefly discussed this today, and we will try to have explain queries use the same RSC as
actual query execution. If the RSC can be shared nicely, it seems okay to have the reducer count
in the plan. Let me know if I missed anything.
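
For readers less familiar with the rule under discussion, here is a minimal sketch of the kind of
size-based estimate SetSparkReducerParallelism performs. The class and method below are
hypothetical illustrations, not Hive's actual code; only the two settings named in the comments
(hive.exec.reducers.bytes.per.reducer and hive.exec.reducers.max) are real Hive knobs. With a
fixed input size and fixed settings the result is deterministic, which is why a stable unit-test
plan is expected.

public final class ReducerEstimate {

    // Derive a reducer count from the estimated reduce-side input size,
    // bounded below by 1 and above by hive.exec.reducers.max.
    static int estimateReducers(long totalInputBytes, long bytesPerReducer, int maxReducers) {
        long wanted = (totalInputBytes + bytesPerReducer - 1) / bytesPerReducer; // ceiling division
        return (int) Math.max(1, Math.min(maxReducers, wanted));
    }

    public static void main(String[] args) {
        // 10 GB of input at 256 MB per reducer (hive.exec.reducers.bytes.per.reducer),
        // capped at 1009 reducers (hive.exec.reducers.max) -> prints 40.
        System.out.println(estimateReducers(10L << 30, 256L << 20, 1009));
    }
}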

> SetSparkReducerParallelism is likely to set too small number of reducers [Spark Branch]
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-9251
>                 URL: https://issues.apache.org/jira/browse/HIVE-9251
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Rui Li
>            Assignee: Rui Li
>         Attachments: HIVE-9251.1-spark.patch, HIVE-9251.2-spark.patch
>
>
> This may hurt performance or even lead to task failures. For example, Spark's netty-based
> shuffle limits the max frame size to 2G.
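
To make that 2G limit concrete with made-up numbers: 500 GB of map output shuffled to only 100
reducers is roughly 5 GB per reduce partition, well past a 2 GB frame, while 1,000 reducers brings
it down to roughly 0.5 GB. Underestimating the reducer count therefore risks exactly this kind of
failure on large inputs.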



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
