spark-commits mailing list archives

From pwend...@apache.org
Subject [37/37] git commit: Merge pull request #293 from pwendell/standalone-driver
Date Fri, 10 Jan 2014 02:38:42 GMT
Merge pull request #293 from pwendell/standalone-driver

SPARK-998: Support Launching Driver Inside of Standalone Mode

[NOTE: I need to bring the tests up to date with new changes, so for now they will fail]

This patch provides support for launching driver programs inside of a standalone cluster manager.
It also supports monitoring and re-launching of driver programs, which is useful for long-running,
recoverable applications such as Spark Streaming jobs. For those jobs, this patch allows a
deployment mode which is resilient to the failure of any worker node, failure of a master
node (provided a multi-master setup is used), and even failures of the application itself, provided
they are recoverable on a restart. Driver information, such as the status and logs from a
driver, is displayed in the UI.
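
To illustrate what "recoverable on a restart" means in practice, here is a minimal, hypothetical
driver sketch (not part of this patch): it persists its own progress so that a re-launched
instance can resume where the previous one stopped.

```
import java.io.{File, PrintWriter}
import scala.io.Source

// Hypothetical example: a driver that survives being killed and re-launched by
// checkpointing its own progress to durable storage (a local file here for brevity;
// a real job would use HDFS or similar).
object RecoverableDriver {
  private val checkpoint = new File("/tmp/recoverable-driver.checkpoint")

  def main(args: Array[String]): Unit = {
    // Resume from the last completed batch if a previous incarnation left a checkpoint.
    val start = if (checkpoint.exists()) {
      val src = Source.fromFile(checkpoint)
      try src.mkString.trim.toInt + 1 finally src.close()
    } else 0

    for (batch <- start until 1000) {
      println(s"Processing batch $batch")
      Thread.sleep(1000)
      // Record progress so a re-launched driver picks up where this one stopped.
      val out = new PrintWriter(checkpoint)
      try out.print(batch.toString) finally out.close()
    }
  }
}
```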

There are a few small TODOs, but the code is generally feature-complete:
- Bring tests up to date and add test coverage
- Restarting on failure should be optional, and perhaps off by default
- See if we can re-use Akka connections to facilitate clients behind a firewall

A sensible place to start a review is the `DriverClient` class, which gives users the ability to
launch their driver program. I've also added an example program (`DriverSubmissionTest`)
that lets you test this locally and play around with killing workers, etc. Most of the
code is devoted to persisting driver state in the cluster manager, exposing it in the UI, and
dealing correctly with various types of failures.
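
To make that last point a bit more concrete, here is a purely illustrative sketch of the kind of
per-driver record the master has to persist and recover; the real fields live in
`DriverDescription.scala` and `DriverInfo.scala` in this patch and may differ from what is shown.

```
// Illustrative only -- not the actual classes from this patch.
case class SubmittedDriver(
    id: String,            // assigned by the master and shown in the UI
    jarUrl: String,        // where a worker fetches the user jar from
    mainClass: String,     // entry point run on the worker
    memoryMb: Int,         // requested driver memory
    cores: Int,            // requested driver cores
    supervise: Boolean)    // whether the master should re-launch the driver on failure

// The persistence engine (file system or ZooKeeper) only needs to write and re-read
// records like this alongside the existing application and worker state, so a new
// master can recover in-flight drivers after fail-over.
```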

Instructions to test locally:
- `sbt/sbt assembly/assembly examples/assembly`
- start a local version of the standalone cluster manager (see the sketch after these steps)
- launch the example driver through `DriverClient`, e.g.:

```
./spark-class org.apache.spark.deploy.client.DriverClient \
  -j -Dspark.test.property=something \
  -e SPARK_TEST_KEY=SOMEVALUE \
  launch spark://10.99.1.14:7077 \
  ../path-to-examples-assembly-jar \
  org.apache.spark.examples.DriverSubmissionTest 1000 some extra options --some-option-here \
  -X 13
```
- Go to the UI and make sure the driver started correctly; look at the output, etc.
- Kill workers, the driver program, masters, etc.
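
For the "start a local cluster" step, a rough sketch of the commands I use (the `Master` and
`Worker` classes are the existing standalone daemons; the host and `./spark-class` invocation
just mirror the launch example above, so adjust for your machine):

```
# Start a standalone master; it listens on port 7077 by default and prints its
# spark:// URL in its log and web UI.
./spark-class org.apache.spark.deploy.master.Master &

# Start one or more workers against that URL; killing and restarting these
# processes later is an easy way to exercise driver re-launching.
./spark-class org.apache.spark.deploy.worker.Worker spark://10.99.1.14:7077 &
```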


Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/d86a85e9
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/d86a85e9
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/d86a85e9

Branch: refs/heads/master
Commit: d86a85e9cac09bd909a356de9181bd282c905e72
Parents: 26cdb5f 67b9a33
Author: Patrick Wendell <pwendell@gmail.com>
Authored: Thu Jan 9 18:37:52 2014 -0800
Committer: Patrick Wendell <pwendell@gmail.com>
Committed: Thu Jan 9 18:37:52 2014 -0800

----------------------------------------------------------------------
 core/pom.xml                                    |  10 +
 .../scala/org/apache/spark/deploy/Client.scala  | 151 ++++++++++++
 .../apache/spark/deploy/ClientArguments.scala   | 117 ++++++++++
 .../org/apache/spark/deploy/DeployMessage.scala |  52 ++++-
 .../apache/spark/deploy/DriverDescription.scala |  29 +++
 .../apache/spark/deploy/client/AppClient.scala  | 201 ++++++++++++++++
 .../spark/deploy/client/AppClientListener.scala |  39 ++++
 .../org/apache/spark/deploy/client/Client.scala | 200 ----------------
 .../spark/deploy/client/ClientListener.scala    |  39 ----
 .../apache/spark/deploy/client/TestClient.scala |   4 +-
 .../apache/spark/deploy/master/DriverInfo.scala |  36 +++
 .../spark/deploy/master/DriverState.scala       |  33 +++
 .../master/FileSystemPersistenceEngine.scala    |  17 +-
 .../org/apache/spark/deploy/master/Master.scala | 189 ++++++++++++++-
 .../spark/deploy/master/PersistenceEngine.scala |  11 +-
 .../apache/spark/deploy/master/WorkerInfo.scala |  20 +-
 .../master/ZooKeeperPersistenceEngine.scala     |  14 +-
 .../spark/deploy/master/ui/IndexPage.scala      |  56 ++++-
 .../spark/deploy/worker/CommandUtils.scala      |  63 +++++
 .../spark/deploy/worker/DriverRunner.scala      | 234 +++++++++++++++++++
 .../spark/deploy/worker/DriverWrapper.scala     |  31 +++
 .../spark/deploy/worker/ExecutorRunner.scala    |  67 ++----
 .../org/apache/spark/deploy/worker/Worker.scala |  63 ++++-
 .../spark/deploy/worker/WorkerWatcher.scala     |  55 +++++
 .../spark/deploy/worker/ui/IndexPage.scala      |  65 +++++-
 .../spark/deploy/worker/ui/WorkerWebUI.scala    |  43 ++--
 .../executor/CoarseGrainedExecutorBackend.scala |  27 ++-
 .../cluster/SparkDeploySchedulerBackend.scala   |  10 +-
 .../apache/spark/deploy/JsonProtocolSuite.scala |  40 +++-
 .../spark/deploy/worker/DriverRunnerTest.scala  | 131 +++++++++++
 .../deploy/worker/ExecutorRunnerTest.scala      |   4 +-
 .../deploy/worker/WorkerWatcherSuite.scala      |  32 +++
 docs/spark-standalone.md                        |  38 ++-
 .../spark/examples/DriverSubmissionTest.scala   |  46 ++++
 pom.xml                                         |  17 ++
 project/SparkBuild.scala                        |   1 +
 36 files changed, 1800 insertions(+), 385 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/d86a85e9/core/src/main/scala/org/apache/spark/deploy/master/Master.scala
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/d86a85e9/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/d86a85e9/pom.xml
----------------------------------------------------------------------

