mesos-issues mailing list archives

From "Avinash Sridharan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MESOS-5325) Mesos can't determine if task IP is reachable
Date Wed, 04 May 2016 21:51:12 GMT

    [ https://issues.apache.org/jira/browse/MESOS-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271498#comment-15271498 ]

Avinash Sridharan commented on MESOS-5325:
------------------------------------------

[~djosborne] Mesos cannot determine whether an IP address allocated to a container is routable
from other containers. I agree that this is an issue with ip-per-container in general, but it
needs to be solved at the service discovery layer (potentially Mesos-DNS). The service discovery
module needs to resolve a name to a routable IP address based on where the DNS query originated.
Effectively, the service discovery layer needs to build a split-horizon view of the network.
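
To illustrate the split-horizon idea, here is a minimal Python sketch, not Mesos-DNS code. The
TaskRecord type, its reachable_from field, and the resolve function are all hypothetical names
introduced for this example; it assumes the discovery layer somehow knows from which source
networks a container IP is routable, which is exactly the information Mesos does not provide today.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Tuple


@dataclass(frozen=True)
class TaskRecord:
    """Hypothetical discovery record; not a Mesos or Mesos-DNS type."""
    name: str
    host_ip: str                      # IP of the agent running the task
    container_ip: str                 # IP reported for the container
    reachable_from: Tuple[str, ...]   # assumed: CIDRs that can route to container_ip


def resolve(task: TaskRecord, client_ip: str) -> str:
    """Answer with the container IP only if the querying client can route to it."""
    client = ip_address(client_ip)
    for cidr in task.reachable_from:
        if client in ip_network(cidr):
            return task.container_ip
    # Otherwise fall back to the agent's IP, which is routable cluster-wide.
    return task.host_ip


# A docker bridge task: its IP is only usable from the agent it runs on.
bridge_task = TaskRecord("web", "10.0.0.5", "172.17.0.2", ("10.0.0.5/32",))
# An ip-per-container task: its IP is usable from the whole (assumed) cluster range.
overlay_task = TaskRecord("db", "10.0.0.6", "192.168.1.7", ("10.0.0.0/24", "192.168.0.0/16"))

print(resolve(bridge_task, "10.0.0.5"))   # query from the same agent
print(resolve(bridge_task, "10.0.0.9"))   # query from another agent
print(resolve(overlay_task, "10.0.0.9"))  # query from another agent
```

The answer depends on who is asking: the same name resolves to the container IP for clients
that can reach it and to the agent IP for everyone else.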

> Mesos can't determine if task IP is reachable
> ---------------------------------------------
>
>                 Key: MESOS-5325
>                 URL: https://issues.apache.org/jira/browse/MESOS-5325
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Dan Osborne
>
> I have uncovered a design flaw that affects ip-per-container tasks when they run in a cluster
> alongside non-ip-per-container tasks. This affects docker-libnetwork and netmodules, and I
> suspect it will also affect CNI.
> After Mesos launches a docker bridge task, it fills the task's NetworkInfo field with the
> docker bridge IP assigned to that task. Because of this behavior, when a launched task's
> NetworkInfo is later used by Mesos components, they cannot tell whether it holds an IP address
> that is accessible throughout the cluster.
> A common case where this is a problem arises when using Mesos-DNS. Mesos-DNS has a
> configuration setting that tells it which information to answer a query with: NetworkInfo
> or HostIP. If it has been configured to prefer NetworkInfo, it correctly resolves
> ip-per-container containers to their unique IP. But because the docker bridge IP is also
> stored in NetworkInfo, it will incorrectly resolve docker-bridge containers to an IP address
> that is not accessible from anywhere besides the slave they are on. This breaks DNS resolution
> in Mesos.
> I believe Mesos needs a way to distinguish between tasks which are accessible via their
> IP and tasks that are not.
> One fix would be to prevent Mesos from filling in NetworkInfo for a task if it is known
> that the task is not reachable throughout the cluster via that address. Essentially,
> NetworkInfo could be interpreted as a boolean: its presence means the task is addressable,
> and its absence means it is not. In practice, this would mean it gets filled in for CNI
> tasks, netmodules tasks, and docker tasks bound to the host networking namespace. It would
> not get filled in for docker bridge tasks.
> I believe this change would be fairly minimal in scope. To implement it, Mesos would
> need to be changed to not store Docker bridge IPs in NetworkInfo.
> I'm also open to discussion and other suggestions on how to resolve this.
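
The fix proposed in the quoted description, treating the presence of NetworkInfo as a signal that
the task is addressable, can be sketched roughly as below. This is an illustrative Python model
only, not Mesos code: NetworkMode, network_info_ip, and dns_answer are hypothetical names, and
the set of addressable modes simply restates the list given in the description.

```python
from enum import Enum
from typing import Optional


class NetworkMode(Enum):
    """Hypothetical labels for the networking setups named in the issue."""
    CNI = "cni"
    NETMODULES = "netmodules"
    DOCKER_HOST = "docker-host"
    DOCKER_BRIDGE = "docker-bridge"


# Per the proposal: only these modes would get NetworkInfo filled in.
ADDRESSABLE_MODES = {NetworkMode.CNI, NetworkMode.NETMODULES, NetworkMode.DOCKER_HOST}


def network_info_ip(mode: NetworkMode, task_ip: str) -> Optional[str]:
    """Return the IP to store in NetworkInfo, or None to leave it unset."""
    return task_ip if mode in ADDRESSABLE_MODES else None


def dns_answer(mode: NetworkMode, task_ip: str, host_ip: str) -> str:
    """A resolver that prefers NetworkInfo and falls back to the host IP."""
    ip = network_info_ip(mode, task_ip)
    return ip if ip is not None else host_ip


print(dns_answer(NetworkMode.CNI, "192.168.1.7", "10.0.0.6"))
print(dns_answer(NetworkMode.DOCKER_BRIDGE, "172.17.0.2", "10.0.0.5"))
```

Under this rule a resolver configured to prefer NetworkInfo would no longer hand out docker
bridge addresses, because for those tasks there would be nothing in NetworkInfo to prefer.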



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
