Mailing-List: contact dev-help@hive.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@hive.apache.org
Date: Sun, 27 Jul 2014 20:02:39 +0000 (UTC)
From: "Xuefu Zhang (JIRA)" <jira@apache.org>
To: hive-dev@hadoop.apache.org
Message-ID: <JIRA.12730087.1406491348606.52876.1406491359060@arcas>
In-Reply-To: <JIRA.12730087.1406491348606@arcas>
References: <JIRA.12730087.1406491348606@arcas>
Subject: [jira] [Created] (HIVE-7525) Research to find out if it's possible
 to submit Spark jobs concurrently using shared SparkContext
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

Xuefu Zhang created HIVE-7525:
---------------------------------

             Summary: Research to find out if it's possible to submit Spark jobs concurrently using shared SparkContext
                 Key: HIVE-7525
                 URL: https://issues.apache.org/jira/browse/HIVE-7525
             Project: Hive
          Issue Type: Task
          Components: Spark
            Reporter: Xuefu Zhang
            Assignee: Chao


See HIVE-7503 and SPARK-2688. Find out whether multiple Spark jobs can be submitted concurrently through a shared SparkContext. SparkClient's code can be modified for this test. The process is:

1. Transform rdd1 to rdd2 using some transformation.
2. Call rdd2.cache() to persist it in memory.
3. In two separate threads, run the following, one per thread:
    Thread a. rdd2 -> rdd3; rdd3.foreach()
    Thread b. rdd2 -> rdd4; rdd4.foreach()
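
The steps above could be sketched roughly as follows, using Spark's Java API and a two-thread executor. This is only an illustration, not SparkClient's actual code: the class name SharedContextSketch, the method runConcurrentJobs, the sample data, the "local[2]" master and the job group names are all assumptions made for the example, and it assumes Java 8 lambdas.

```java
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical test harness; not part of Hive or SparkClient.
public class SharedContextSketch {

    // Submits two jobs from two threads against one shared context.
    // Returns the number of jobs that completed without error.
    static int runConcurrentJobs(JavaSparkContext sc) throws Exception {
        // Sample input, chosen only for illustration.
        JavaRDD<Integer> rdd1 = sc.parallelize(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8));

        // Step 1: transform rdd1 into rdd2.
        JavaRDD<Integer> rdd2 = rdd1.map(x -> x * 2);

        // Step 2: mark rdd2 for caching; it is materialized by the first job that uses it.
        rdd2.cache();

        // Step 3: derive rdd3 and rdd4 in separate threads and trigger a job in each.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Future<?> a = pool.submit(() -> {
                // Job groups are set per thread, so each job can be told apart when monitoring.
                sc.setJobGroup("thread-a", "rdd2 -> rdd3; rdd3.foreach()");
                JavaRDD<Integer> rdd3 = rdd2.map(x -> x + 1);
                rdd3.foreach(x -> { });
            });
            Future<?> b = pool.submit(() -> {
                sc.setJobGroup("thread-b", "rdd2 -> rdd4; rdd4.foreach()");
                JavaRDD<Integer> rdd4 = rdd2.filter(x -> x % 4 == 0);
                rdd4.foreach(x -> { });
            });

            // get() rethrows a job failure wrapped in an ExecutionException,
            // which is one place to observe error reporting.
            int completed = 0;
            a.get();
            completed++;
            b.get();
            completed++;
            return completed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf()
                .setMaster("local[2]")          // assumed local master for the experiment
                .setAppName("shared-context-sketch");
        JavaSparkContext sc = new JavaSparkContext(conf);
        try {
            System.out.println("jobs completed: " + runConcurrentJobs(sc));
        } finally {
            sc.stop();
        }
    }
}
```

Whether the two jobs actually overlap in time, and whether rdd2 is computed once or twice, is what the experiment is meant to reveal; the sketch only sets up the conditions for observing it.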

It would also be useful to find out how monitoring and error reporting behave in this setup.

