spark-issues mailing list archives

From "Sasi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-17607) --driver-url doesn't point to my master_ip.
Date Wed, 28 Sep 2016 10:52:21 GMT

    [ https://issues.apache.org/jira/browse/SPARK-17607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15529254#comment-15529254 ]

Sasi commented on SPARK-17607:
------------------------------

Hi,
Any update or thoughts about this issue?
Thanks,
Sasi

> --driver-url doesn't point to my master_ip.
> -------------------------------------------
>
>                 Key: SPARK-17607
>                 URL: https://issues.apache.org/jira/browse/SPARK-17607
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 1.5.2
>            Reporter: Sasi
>            Priority: Critical
>
> Hi,
> I have a master machine and a slave machine.
> My master machine has two network interfaces.
> The first interface has the IP 10.5.5.2, and the other interface has the IP 10.0.42.230.
> I configured MASTER_IP to be 10.5.5.2, so once the master and its worker come up, I see the following INFO lines:
> {code}
> 16/09/20 12:32:32 INFO Worker: Successfully registered with master spark://10.5.5.2:7077
> 16/09/20 12:39:15 INFO Worker: Asked to launch executor app-20160920123915-0000/0 for Spark-DataAccessor-JBoss
> {code}
> I set SPARK_LOCAL_IP on each worker to its own IP, e.g. 10.5.5.5.
> Both variables were configured in spark-env.sh.
> The problem started when I tried to get data from my workers.
> I got the following INFO line in each worker log:
> {code}
> "--driver-url" "akka.tcp://sparkDriver@10.0.42.230:43683/user/CoarseGrainedScheduler" "
> {code}
> As you can see, the master IP is different from the driver-url IP.
> The master IP is 10.5.5.2 but the driver-url points to 10.0.42.230, so I'm getting the following errors:
> {code}
> 16/09/20 12:17:57 INFO Slf4jLogger: Slf4jLogger started
> 16/09/20 12:17:57 INFO Remoting: Starting remoting
> 16/09/20 12:17:57 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@10.5.5.5:34961]
> 16/09/20 12:17:57 INFO Utils: Successfully started service 'driverPropsFetcher' on port 34961.
> 16/09/20 12:19:00 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@10.0.42.230:36711] has failed, address is now gated for [5000] ms. Reason: [Association failed with [akka.tcp://sparkDriver@10.0.42.230:36711]] Caused by: [Connection timed out: /10.0.42.230:36711]
> Exception in thread "main" akka.actor.ActorNotFound: Actor not found for: ActorSelection[Anchor(akka.tcp://sparkDriver@10.0.42.230:36711/), Path(/user/CoarseGrainedScheduler)]
>         at
> {code}
> {code}
>  "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "akka.tcp://sparkDriver@10.0.42.230:43683/user/CoarseGrainedScheduler"
> {code}
> The master is listening and accepting connections on 10.5.5.2, not 10.0.42.230.
> It looks like the driver-url ignores the actual MASTER_IP.
> Thanks,
> Sasi
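
The mismatch described above can be checked mechanically across worker logs. The following is only a minimal sketch: `driver_host` and `driver_url_matches` are hypothetical helper names (not part of Spark), and the URL pattern is assumed from the `--driver-url` values quoted in this report rather than from any Spark specification.

```python
import re

# Pattern assumed from the "--driver-url" values quoted in the report, e.g.
# akka.tcp://sparkDriver@10.0.42.230:43683/user/CoarseGrainedScheduler
_DRIVER_URL = re.compile(
    r"^akka\.tcp://(?P<system>[^@]+)@(?P<host>[^:/]+):(?P<port>\d+)(?P<path>/.*)?$"
)


def driver_host(driver_url):
    """Return the host part of an akka.tcp driver URL (hypothetical helper)."""
    match = _DRIVER_URL.match(driver_url)
    if match is None:
        raise ValueError("unrecognized driver URL: %r" % driver_url)
    return match.group("host")


def driver_url_matches(driver_url, expected_ip):
    """True if the driver URL advertises the IP the operator expects."""
    return driver_host(driver_url) == expected_ip


if __name__ == "__main__":
    url = "akka.tcp://sparkDriver@10.0.42.230:43683/user/CoarseGrainedScheduler"
    # 10.5.5.2 is the address the reporter expected the workers to connect to.
    print(driver_host(url), driver_url_matches(url, "10.5.5.2"))
```

Run against the URL from the worker log, the helper reports host 10.0.42.230, which does not match the expected 10.5.5.2.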



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

