spark-issues mailing list archives

From "DUC LIEM NGUYEN (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-22382) Spark on mesos: doesn't support public IP setup for agent and master.
Date Sat, 28 Oct 2017 20:15:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-22382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

DUC LIEM NGUYEN updated SPARK-22382:
------------------------------------
    Affects Version/s:     (was: 2.1.1)
                       2.1.0
          Description: 
I've set up a system as follows:

--mesos master: private IP 10.x.x.2, public IP 35.x.x.6

--mesos slave: private IP 192.x.x.10, public IP 111.x.x.2

The master assigns the task to the slave successfully; however, the task fails. The
error message is as follows:

{color:#d04437}Exception in thread "main" 17/10/11 22:38:01 ERROR RpcOutboxMessage: Ask timeout
before connecting successfully

Caused by: org.apache.spark.rpc.RpcTimeoutException: Cannot receive any reply in 120 seconds.
This timeout is controlled by spark.rpc.askTimeout
{color}
When I look at the environment, spark.driver.host points to the master's private IP
address (10.x.x.2) instead of its public IP address (35.x.x.6). A Wireshark capture
confirms this: there were failed TCP packets sent to the master's private IP address.

Now if I set spark.driver.bindAddress on the master to its local IP address and spark.driver.host
on the master to its public IP address, I get the following message:

{color:#d04437}ERROR TaskSchedulerImpl: Lost executor 1 on myhostname.singnet.com.sg: Unable
to create executor due to Cannot assign requested address.{color}

From my understanding, spark.driver.bindAddress applies to both the master and the slave,
hence the slave gets this error. How do I properly set up Spark to work in this
cluster over public IPs?
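For reference, the split the reporter is attempting can be written out as a spark-submit invocation. This is only an illustrative sketch, not a verified fix: the IP addresses are the ones from the report, the Mesos port and the driver/block-manager ports are arbitrary example values, and my_app.py is a placeholder application.

```shell
# Illustrative sketch only (not a confirmed fix). Addresses are from the
# report above; port numbers and my_app.py are placeholder examples.
#
# spark.driver.bindAddress: local address the driver binds its sockets to
#   (the master host's private IP, which exists on that host).
# spark.driver.host: address advertised to executors for callbacks
#   (the master's public IP, which is what the agent can reach).
./bin/spark-submit \
  --master mesos://35.x.x.6:5050 \
  --conf spark.driver.bindAddress=10.x.x.2 \
  --conf spark.driver.host=35.x.x.6 \
  --conf spark.driver.port=7078 \
  --conf spark.blockManager.port=7079 \
  my_app.py
```

The reporter's observation is that this combination still fails on the executor side ("Cannot assign requested address"), which suggests the bind address is also being applied on the agent, where that private IP does not exist.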


> Spark on mesos: doesn't support public IP setup for agent and master. 
> ----------------------------------------------------------------------
>
>                 Key: SPARK-22382
>                 URL: https://issues.apache.org/jira/browse/SPARK-22382
>             Project: Spark
>          Issue Type: Question
>          Components: Mesos
>    Affects Versions: 2.1.0
>            Reporter: DUC LIEM NGUYEN
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

