hive-issues mailing list archives

From "Chengxiang Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-11276) Optimization around job submission and adding jars [Spark Branch]
Date Fri, 17 Jul 2015 05:23:04 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630791#comment-14630791 ]

Chengxiang Li commented on HIVE-11276:
--------------------------------------

That makes sense to me. Launching the Spark cluster during the first query execution would mislead
users into thinking that Hive on Spark is slower than it actually is. Besides, we could also open the
Spark session when the user sets hive.execution.engine to spark.

> Optimization around job submission and adding jars [Spark Branch]
> -----------------------------------------------------------------
>
>                 Key: HIVE-11276
>                 URL: https://issues.apache.org/jira/browse/HIVE-11276
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: 1.1.0
>            Reporter: Xuefu Zhang
>            Assignee: Chengxiang Li
>
> It seems that Hive on Spark has some room for performance improvement in job submission.
Specifically, we are calling refreshLocalResources() for every job submission even when there
are no changes in the jar list. Since Hive on Spark reuses the containers for the whole
user session, we might be able to optimize that.
> We do need to take into consideration the case of dynamic allocation, in which new executors
might be added.
> This task is some R&D in this area.
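
One way to picture the optimization described above is to remember which jars a session has already shipped and to refresh only when the list changes. The following is a minimal sketch under that assumption, not the actual Hive implementation: the class JarRefreshTracker and its method newJars are hypothetical names made up for illustration, and the sketch does not address the dynamic allocation case, where newly added executors would still need the jars.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

/**
 * Hypothetical helper (not part of Hive): tracks the jars that have
 * already been shipped within one user session.
 */
final class JarRefreshTracker {
    // Jars already shipped in this session, in first-seen order.
    private final Set<String> shippedJars = new LinkedHashSet<>();

    /**
     * Returns the jars in requestedJars that have not been shipped yet
     * and records them as shipped. An empty result means the jar list
     * is unchanged, so the caller could skip the refresh for this job.
     */
    synchronized Set<String> newJars(Collection<String> requestedJars) {
        Set<String> added = new LinkedHashSet<>();
        for (String jar : requestedJars) {
            // Set.add returns true only for jars not seen before.
            if (shippedJars.add(jar)) {
                added.add(jar);
            }
        }
        return added;
    }
}
```

A caller would consult newJars before each job submission and refresh local resources only when the returned set is non-empty. With dynamic allocation, the tracker would presumably have to be reset or bypassed when new executors join, which is the open question raised in the description.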



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
