spark-issues mailing list archives

From "Patrick Wendell (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-5113) Audit and document use of hostnames and IP addresses in Spark
Date Tue, 06 Jan 2015 20:18:34 GMT

     [ https://issues.apache.org/jira/browse/SPARK-5113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Wendell updated SPARK-5113:
-----------------------------------
    Description: 
Spark has multiple network components that start servers and advertise their network addresses
to other processes.

We should go through each of these components and make sure they have consistent and/or documented
behavior with respect to (a) what interface(s) they bind to and (b) what hostname they use to
advertise themselves to other processes. We should document this clearly and explain to people
what to do in different cases (e.g. EC2, Dockerized containers, etc.).

When Spark initializes, it searches the network interfaces until it finds one whose address is
not a loopback address. It then does a reverse DNS lookup for a hostname associated with that
interface, and the network components use that hostname to advertise themselves to other
processes. That hostname is also the one used for the Akka system identifier. In some cases the
same hostname is also used as the bind hostname (e.g. I think this happens in the connection
manager and possibly Akka), which will likely result internally in the hostname being re-resolved
to an IP address. In other cases (the web UI and Netty shuffle) we seem to bind to all interfaces.
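
To make the lookup order described above concrete, here is a minimal sketch in Python. It is not
Spark's actual code: the function names first_non_loopback and advertised_hostname are
hypothetical, the candidate addresses are passed in by the caller rather than enumerated from the
operating system, and falling back to the raw address when reverse DNS fails is an assumption
made for illustration.

```python
import ipaddress
import socket


def first_non_loopback(candidates):
    """Return the first address string that is not a loopback address, or None.

    `candidates` stands in for the addresses of the local network interfaces,
    in whatever order the platform reports them (an assumption of this sketch).
    """
    for text in candidates:
        if not ipaddress.ip_address(text).is_loopback:
            return text
    return None


def advertised_hostname(address, reverse_lookup=socket.gethostbyaddr):
    """Reverse-resolve `address` to a hostname to advertise to other processes.

    `reverse_lookup` is injectable so the behavior can be checked without DNS.
    If the lookup fails, fall back to the address itself (an assumption here,
    not something the issue text states).
    """
    try:
        hostname, _aliases, _addresses = reverse_lookup(address)
        return hostname
    except OSError:  # socket.herror and socket.gaierror are subclasses of OSError
        return address
```

The point of the sketch is that the advertised name depends on two environment-specific inputs,
the interface order and the reverse DNS answer, which is why cases such as EC2 or containers can
produce a hostname that other processes cannot resolve.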

  was:
Spark has multiple network components that start servers and advertise their network addresses
to other processes.

We should go through each of these components and make sure they have consistent and/or documented
behavior with respect to (a) what interface(s) they bind to and (b) what hostname they use to
advertise themselves to other processes. We should document this clearly and explain to people
what to do in different cases (e.g. EC2, Dockerized containers, etc.).

When Spark initializes, it searches the network interfaces until it finds one whose address is
not a loopback address. It then does a reverse DNS lookup for a hostname associated with that
interface, and the network components use that hostname to advertise themselves to other
processes. In some cases the same hostname is also used as the bind hostname (e.g. I think this
happens in the connection manager and possibly Akka), which will likely result internally in the
hostname being re-resolved to an IP address. In other cases (the web UI and Netty shuffle) we
seem to bind to all interfaces.


> Audit and document use of hostnames and IP addresses in Spark
> -------------------------------------------------------------
>
>                 Key: SPARK-5113
>                 URL: https://issues.apache.org/jira/browse/SPARK-5113
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Patrick Wendell
>            Priority: Critical



