spark-issues mailing list archives

From "Christopher Bourez (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-13317) SPARK_LOCAL_IP does not bind on Slaves
Date Sun, 14 Feb 2016 19:01:18 GMT

    [ https://issues.apache.org/jira/browse/SPARK-13317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146689#comment-15146689
] 

Christopher Bourez edited comment on SPARK-13317 at 2/14/16 7:00 PM:
---------------------------------------------------------------------

I launch a cluster:
{code}
 ./ec2/spark-ec2 -k sparkclusterkey -i ~/sparkclusterkey.pem --region=eu-west-1 --copy-aws-credentials
--instance-type=m1.large -s 4 --hadoop-major-version=2 launch spark-cluster
{code}
which gives me a master at ec2-54-229-16-73.eu-west-1.compute.amazonaws.com
and slaves at ec2-54-194-99-236.eu-west-1.compute.amazonaws.com, etc.
If I launch a job in client mode from another network, for example from a Zeppelin notebook
on my MacBook, whose configuration is equivalent to
{code}
spark-shell --master=spark://ec2-54-229-16-73.eu-west-1.compute.amazonaws.com:7077
{code}
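For reference, the Zeppelin side of that equivalence would look roughly like this (a sketch only; the exact mechanism depends on the Zeppelin version, which is not stated in this report, and older releases read the master URL from an environment variable):

```shell
# conf/zeppelin-env.sh (hypothetical path; adjust to your Zeppelin install)
# Point Zeppelin's Spark interpreter at the standalone master from this report
export MASTER=spark://ec2-54-229-16-73.eu-west-1.compute.amazonaws.com:7077
```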
I see in the logs:

{code}
16/02/14 19:55:04 INFO AppClient$ClientEndpoint: Executor added: app-20160214185504-0000/0
on worker-20160214185030-172.31.4.179-34425 (172.31.4.179:34425) with 2 cores
16/02/14 19:55:04 INFO SparkDeploySchedulerBackend: Granted executor ID app-20160214185504-0000/0
on hostPort 172.31.4.179:34425 with 2 cores, 1024.0 MB RAM
16/02/14 19:55:04 INFO AppClient$ClientEndpoint: Executor added: app-20160214185504-0000/1
on worker-20160214185030-172.31.4.176-47657 (172.31.4.176:47657) with 2 cores
16/02/14 19:55:04 INFO SparkDeploySchedulerBackend: Granted executor ID app-20160214185504-0000/1
on hostPort 172.31.4.176:47657 with 2 cores, 1024.0 MB RAM
16/02/14 19:55:04 INFO AppClient$ClientEndpoint: Executor added: app-20160214185504-0000/2
on worker-20160214185031-172.31.4.177-41379 (172.31.4.177:41379) with 2 cores
16/02/14 19:55:04 INFO SparkDeploySchedulerBackend: Granted executor ID app-20160214185504-0000/2
on hostPort 172.31.4.177:41379 with 2 cores, 1024.0 MB RAM
16/02/14 19:55:04 INFO AppClient$ClientEndpoint: Executor added: app-20160214185504-0000/3
on worker-20160214185032-172.31.4.178-34353 (172.31.4.178:34353) with 2 cores
16/02/14 19:55:04 INFO SparkDeploySchedulerBackend: Granted executor ID app-20160214185504-0000/3
on hostPort 172.31.4.178:34353 with 2 cores, 1024.0 MB RAM
16/02/14 19:55:04 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.1.11:64058
with 511.5 MB RAM, BlockManagerId(driver, 192.168.1.11, 64058)
16/02/14 19:55:04 INFO BlockManagerMaster: Registered BlockManager
{code}

These are private IPs that my MacBook cannot access, and when I launch a job, an error
follows:
{code}
16/02/14 19:57:19 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check
your cluster UI to ensure that workers are registered and have sufficient resources
{code}
I tried connecting to a slave, setting SPARK_LOCAL_IP in the slave's spark-env.sh, and
stopping and restarting all slaves from the master, but the Spark master still returns the private IP.
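For concreteness, the attempted workaround amounts to something like the following (a sketch: the IP is taken from the slave hostname in this report, and /root/spark is assumed as the install path used by spark-ec2 AMIs):

```shell
# On the slave: bind Spark to a reachable address instead of the private one
# (assumed value, derived from ec2-54-194-99-236.eu-west-1.compute.amazonaws.com)
echo 'export SPARK_LOCAL_IP=54.194.99.236' >> /root/spark/conf/spark-env.sh

# On the master: stop and restart all slaves so the setting is picked up
/root/spark/sbin/stop-slaves.sh
/root/spark/sbin/start-slaves.sh
```

Even after this restart, the master's executor list in the logs above still advertises the 172.31.x.x private addresses, which is the behavior being reported.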



> SPARK_LOCAL_IP does not bind on Slaves
> --------------------------------------
>
>                 Key: SPARK-13317
>                 URL: https://issues.apache.org/jira/browse/SPARK-13317
>             Project: Spark
>          Issue Type: Bug
>         Environment: Linux EC2, different VPC 
>            Reporter: Christopher Bourez
>
> SPARK_LOCAL_IP does not bind to the provided IP on slaves.
> When launching a job or a spark-shell from a second network, the returned IP for the
slave is still the first IP of the slave. 
> So the job fails with the message:
> Initial job has not accepted any resources; check your cluster UI to ensure that workers
are registered and have sufficient resources
> It is not a question of resources; the driver simply cannot connect to the slave, given
the wrong IP.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

