spark-user mailing list archives

From "M@" <>
Subject launching a spark cluster in ec2 from within an application
Date Mon, 21 Jul 2014 19:04:12 GMT
I would like to programmatically start a Spark cluster in EC2 from another
app running in EC2, run my job, and then destroy the cluster.  I can launch a
Spark EMR cluster easily enough using the SDK; however, I ran into two
problems:
1) I was only able to retrieve the address of the master node from the
console, not via the SDK.
2) I was not able to connect to the master from my app after setting
"spark://public_dns:7077" as the master in the SparkConf (where public_dns
is the address listed for the cluster on the EMR console page in Amazon).  I
kept getting "all masters are unresponsive" errors.
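
For illustration, both points can be probed from code. This is a minimal
sketch in Python with boto3, not the Scala/Java SDK used by the app described
here. It relies on the EMR DescribeCluster response exposing the master
address as Cluster.MasterPublicDnsName, and on the Spark master listening on
port 7077 as in the URL above. The helper names master_public_dns and
port_is_open are hypothetical.

```python
import socket


def master_public_dns(emr_client, cluster_id):
    """Return the master's public DNS name, or None if it is not assigned yet."""
    response = emr_client.describe_cluster(ClusterId=cluster_id)
    return response["Cluster"].get("MasterPublicDnsName")


def port_is_open(host, port=7077, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

With emr = boto3.client("emr"), calling master_public_dns(emr, cluster_id)
and then port_is_open(dns) from the application host shows whether port 7077
is reachable at all. This is only a TCP-level check: if it fails, the cause
may be network access (for example security group rules) rather than the
SparkConf, but that is an assumption to verify, not a diagnosis.
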
In addition, the Amazon docs only describe running Spark jobs in EMR by
ssh'ing to the master, launching a Spark shell, and running the jobs from
there.  Is it even possible to do this programmatically from another app, or
must you log in to the master and run jobs from the shell if you want to use
Spark in Amazon EMR?
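
One programmatic route that avoids the shell is to submit the job as an EMR
step. The sketch below is in Python with boto3 and assumes an EMR release that
ships command-runner.jar and spark-submit (EMR 4.x and later, which postdates
this message); with an older bootstrap-action install of Spark the jar and
arguments would differ. The helper names, the "CONTINUE" failure action, and
the cluster deploy mode are illustrative choices, not a recommendation from
the Amazon docs.

```python
def spark_submit_step(name, app_jar, main_class, app_args=()):
    """Build an EMR step definition that runs spark-submit on the cluster."""
    args = ["spark-submit", "--deploy-mode", "cluster",
            "--class", main_class, app_jar]
    args.extend(app_args)
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",  # keep the cluster alive if the job fails
        "HadoopJarStep": {"Jar": "command-runner.jar", "Args": args},
    }


def submit_step(emr_client, cluster_id, step):
    """Submit one step to a running cluster and return its step id."""
    response = emr_client.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    return response["StepIds"][0]
```

The application would build the step from its own jar location and main class,
submit it with a boto3 EMR client, poll the step status, and terminate the
cluster once the step finishes.
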

The second approach I tried was simply calling the spark-ec2 script from my
app, passing the same parameters that I use to launch the cluster manually
from the CLI.  This failed because the ec2.connect call returns None when
called from my app (Scala/Java on Play), whereas it works perfectly when
called from the CLI.
Is there a recommended method to launch EC2 clusters dynamically from within
an app running in EC2?
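
One possible explanation, offered as an assumption rather than a confirmed
cause, is that the script sees a different environment when started from the
app than when started from an interactive shell. boto, which spark-ec2 uses,
can read credentials from AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, so a
child process launched without them may behave differently. The sketch below
is in Python for consistency (the same idea applies to a Scala/Java process
launcher); run_with_env is a hypothetical helper.

```python
import os
import subprocess


def run_with_env(command, extra_env):
    """Run command with the parent's environment plus extra_env.

    Returns (exit_code, combined stdout and stderr as text).
    """
    env = dict(os.environ)
    env.update(extra_env)
    result = subprocess.run(
        command,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.returncode, result.stdout
```

Passing the same argument list used at the CLI as command, with the
credentials and region supplied explicitly in extra_env, and logging the
returned output makes it easier to see what the script reports when it is
launched from the app.
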
