spark-commits mailing list archives

Subject spark git commit: [EC2] [SPARK-6188] Instance types can be mislabeled when re-starting cluster with default arguments
Date Mon, 09 Mar 2015 14:16:13 GMT
Repository: spark
Updated Branches:
  refs/heads/master 55b1b32dc -> f7c799204

[EC2] [SPARK-6188] Instance types can be mislabeled when re-starting cluster with default arguments

As described in and discovered in

When re-starting a cluster, if the user does not provide the instance types, which is
currently the recommended behavior in the docs, the instances are assigned the default type
m1.large. This then affects how the machines are set up.

This patch solves the problem by reading the instance types from the existing instances and
overwriting the default options with them.
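
The approach can be sketched in isolation as follows. This is a minimal illustration under
stated assumptions, not the patched script itself: `Node`, `Options` and
`resolve_instance_types` are hypothetical stand-ins for the running instance objects and the
parsed command-line options, and the sketch assumes that the first node of each group is
representative of the whole group.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a running EC2 instance object."""
    instance_type: str


@dataclass
class Options:
    """Hypothetical stand-in for the parsed command-line options."""
    instance_type: str = "m1.large"  # default type, as described above
    master_instance_type: str = ""   # empty string: master uses the slave type


def resolve_instance_types(master_nodes: List[Node],
                           slave_nodes: List[Node],
                           opts: Options) -> Options:
    """Overwrite the default instance types with those of the running nodes."""
    existing_master_type = master_nodes[0].instance_type
    existing_slave_type = slave_nodes[0].instance_type
    # An empty master type signals "same type as the slaves".
    if existing_master_type == existing_slave_type:
        existing_master_type = ""
    opts.master_instance_type = existing_master_type
    opts.instance_type = existing_slave_type
    return opts


if __name__ == "__main__":
    opts = resolve_instance_types([Node("r3.large")], [Node("r3.large")], Options())
    print(repr(opts.master_instance_type), repr(opts.instance_type))
```

With identical master and slave types the master type is left empty, so later setup code that
falls back to the slave type keeps working unchanged.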

EDIT: Further clarification of the issue:

In short, while the instances themselves are the same as launched, their setup is done assuming
the default instance type, m1.large.

This means that the machines are assumed to have 2 disks, which leads to the problems
described in issue 5838, where machines that have one disk end up with shuffle spills in
the small (8GB) snapshot partition, which quickly fills up and results in jobs failing with
"No space left on device".
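
To make the failure mode concrete, here is a rough sketch of how a wrong instance type turns
into a wrong disk layout. Everything in it is an assumption made for illustration: the table,
the helper names and the `/mnt` paths are hypothetical, and only the m1.large disk count comes
from the description above (the r3.large entry is an example of a one-disk type).

```python
# Illustrative table: number of ephemeral disks per instance type.
# Only the m1.large value is taken from the description above.
DISKS_BY_INSTANCE = {
    "m1.large": 2,
    "r3.large": 1,  # assumed example of a single-disk instance type
}


def spill_dirs(instance_type: str) -> list:
    """Return the (hypothetical) local directories setup would configure."""
    num_disks = DISKS_BY_INSTANCE[instance_type]
    # First disk is mounted at /mnt, further disks at /mnt2, /mnt3, ...
    return ["/mnt/spark"] + ["/mnt%d/spark" % i for i in range(2, num_disks + 1)]


def missing_dirs(assumed_type: str, actual_type: str) -> list:
    """Directories configured for the assumed type that have no backing disk."""
    actual = set(spill_dirs(actual_type))
    return [d for d in spill_dirs(assumed_type) if d not in actual]


if __name__ == "__main__":
    # Setup assumes the default type while the machine really has one disk.
    print(missing_dirs("m1.large", "r3.large"))
```

Any directory reported by `missing_dirs` has no ephemeral disk behind it, so data written there
would land on the small root partition instead.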

Other instance-specific settings that are set in the script are likely to be
wrong as well.

Author: Theodore Vasiloudis <>

Closes #4916 from thvasilo/SPARK-6188]-Instance-types-can-be-mislabeled-when-re-starting-cluster-with-default-arguments
and squashes the following commits:

6705b98 [Theodore Vasiloudis] Added comment to clarify setting master instance type to the
empty string.
a3d29fe [Theodore Vasiloudis] More trailing whitespace
7b32429 [Theodore Vasiloudis] Removed trailing whitespace
3ebd52a [Theodore Vasiloudis] Make sure that the instance type is correct when relaunching
a cluster.


Branch: refs/heads/master
Commit: f7c799204358bcc38c5972a29e5994b78b25b515
Parents: 55b1b32
Author: Theodore Vasiloudis <>
Authored: Mon Mar 9 14:16:07 2015 +0000
Committer: Sean Owen <>
Committed: Mon Mar 9 14:16:07 2015 +0000

 ec2/ | 11 +++++++++++
 1 file changed, 11 insertions(+)
diff --git a/ec2/ b/ec2/
index 5e636dd..b50b381 100755
--- a/ec2/
+++ b/ec2/
@@ -1307,6 +1307,17 @@ def real_main():
             cluster_instances=(master_nodes + slave_nodes),
+        # Determine types of running instances
+        existing_master_type = master_nodes[0].instance_type
+        existing_slave_type = slave_nodes[0].instance_type
+        # Setting opts.master_instance_type to the empty string indicates we
+        # have the same instance type for the master and the slaves
+        if existing_master_type == existing_slave_type:
+            existing_master_type = ""
+        opts.master_instance_type = existing_master_type
+        opts.instance_type = existing_slave_type
         setup_cluster(conn, master_nodes, slave_nodes, opts, False)
