mesos-issues mailing list archives

From "Andrew Schwartzmeyer (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (MESOS-3790) Zk connection should retry on EAI_NONAME
Date Wed, 13 Jun 2018 22:28:00 GMT

     [ https://issues.apache.org/jira/browse/MESOS-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Schwartzmeyer reassigned MESOS-3790:
-------------------------------------------

    Assignee: Andrew Schwartzmeyer

> Zk connection should retry on EAI_NONAME
> ----------------------------------------
>
>                 Key: MESOS-3790
>                 URL: https://issues.apache.org/jira/browse/MESOS-3790
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Neil Conway
>            Assignee: Andrew Schwartzmeyer
>            Priority: Minor
>              Labels: mesosphere, zookeeper
>
> The ZooKeeper interface is designed to retry (once per second for up to ten minutes) if one or more of the ZooKeeper hostnames can't be resolved (see [MESOS-1326] and [MESOS-1523]).
> The current implementation assumes that a DNS resolution failure is indicated by zookeeper_init() returning NULL with errno set to EINVAL (Zk translates getaddrinfo() failures into errno values). However, the current Zk code maps EAI_NONAME to ENOENT rather than EINVAL:
> {code}
> static int getaddrinfo_errno(int rc) {
>     switch(rc) {
>     case EAI_NONAME:
> // ZOOKEEPER-1323 EAI_NODATA and EAI_ADDRFAMILY are deprecated in FreeBSD.
> #if defined EAI_NODATA && EAI_NODATA != EAI_NONAME
>     case EAI_NODATA:
> #endif
>         return ENOENT;
>     case EAI_MEMORY:
>         return ENOMEM;
>     default:
>         return EINVAL;
>     }
> }
> {code}
> getaddrinfo() returns EAI_NONAME when "the node or service is not known"; per discussion in [MESOS-2186], this seems to happen intermittently due to DNS failures.
> Proposed fix: looking at errno is always going to be somewhat fragile, but if we're going to continue doing that, we should check for ENOENT as well as EINVAL.
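
The proposed check could look roughly like the sketch below. It is not the Mesos implementation: `is_retryable_init_error()` is a hypothetical helper name, and the mapping function is copied from the Zk snippet quoted above only so that the example compiles on its own. The assumption is that the caller reads errno immediately after zookeeper_init() returns NULL and passes it in.

```c
#include <errno.h>
#include <netdb.h>
#include <stdbool.h>

/* Copy of the Zk mapping quoted in the issue, included only so this
 * sketch is self-contained: getaddrinfo() failures become errno values. */
static int getaddrinfo_errno(int rc) {
    switch (rc) {
    case EAI_NONAME:
#if defined EAI_NODATA && EAI_NODATA != EAI_NONAME
    case EAI_NODATA:
#endif
        return ENOENT;
    case EAI_MEMORY:
        return ENOMEM;
    default:
        return EINVAL;
    }
}

/* Hypothetical helper (not part of Mesos or ZooKeeper): decide whether
 * an errno value left behind by a failed zookeeper_init() looks like a
 * DNS resolution failure that is worth retrying. Accepting ENOENT in
 * addition to EINVAL covers the EAI_NONAME case described above. */
static bool is_retryable_init_error(int err) {
    return err == EINVAL || err == ENOENT;
}
```

With this check, an errno produced from EAI_NONAME (ENOENT) is treated the same as the EINVAL case the existing retry logic already handles, while ENOMEM is still treated as a hard failure.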



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
