hive-dev mailing list archives

From "Xuefu Zhang (JIRA)" <>
Subject [jira] [Commented] (HIVE-7382) Create a MiniSparkCluster and set up a testing framework [Spark Branch]
Date Thu, 25 Sep 2014 04:16:33 GMT


Xuefu Zhang commented on HIVE-7382:

Hi [~lirui], yes, we'd like to use Spark local-cluster to back a mini cluster when running
tests because it's closer to a real cluster and easy to start. I know it's for Spark internal
use, but for tests we should be okay. Especially, it's easy to switch to local if we have to.
Such a mini cluster more closely resembles an MR mini cluster. It's also easy for us to control
the number of workers, executors per node, memory, and so on. Thus, I think this is a nice thing
to have. Thanks for researching into this area.

When I did the POC, local-cluster actually worked, of course after resolving a few library
conflicts. We might have similar problems with the current code base.
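To make the idea concrete, Spark accepts a master URL of the form local-cluster[numWorkers,coresPerWorker,memoryPerWorkerMB], which launches worker processes on the local machine. A minimal sketch of how a MiniSparkCluster wrapper might build that URL (the class and method names here are hypothetical illustrations, not actual Hive code):

```java
// Sketch: building the Spark "local-cluster" master URL that a MiniSparkCluster
// wrapper could pass to SparkConf. The helper is hypothetical; the URL format
// local-cluster[numWorkers,coresPerWorker,memoryPerWorkerMB] is Spark's own.
public class MiniSparkClusterSketch {

    /** Builds a local-cluster master URL, e.g. "local-cluster[2,2,1024]". */
    static String localClusterMaster(int numWorkers, int coresPerWorker,
                                     int memoryPerWorkerMb) {
        return String.format("local-cluster[%d,%d,%d]",
                numWorkers, coresPerWorker, memoryPerWorkerMb);
    }

    public static void main(String[] args) {
        // Two worker processes, two cores and 1 GB each; a test harness could
        // then do, e.g.:
        //   new SparkConf().setMaster(master).setAppName("hive-minispark-test")
        String master = localClusterMaster(2, 2, 1024);
        System.out.println(master); // local-cluster[2,2,1024]
    }
}
```

Falling back to plain local mode, as mentioned above, would then just be a matter of passing "local[*]" as the master URL instead.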

> Create a MiniSparkCluster and set up a testing framework [Spark Branch]
> -----------------------------------------------------------------------
>                 Key: HIVE-7382
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Rui Li
>              Labels: Spark-M1
> To automatically test Hive functionality over the Spark execution engine, we need to create
a test framework that can execute Hive queries with Spark as the backend. For that, we should
create a MiniSparkCluster, similar to the ones for other execution engines.
> Spark has a way to create a local cluster with a few processes on the local machine,
where each process is a worker node. It's fairly close to a real Spark cluster. Our mini cluster
can be based on that.
> For more info, please refer to the design doc on wiki.

This message was sent by Atlassian JIRA
