hadoop-yarn-issues mailing list archives

From "Wilfred Spiegelenburg (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4971) RM fails to re-bind to wildcard IP after failover in multi homed clusters
Date Wed, 20 Apr 2016 14:12:25 GMT

    [ https://issues.apache.org/jira/browse/YARN-4971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249961#comment-15249961 ]

Wilfred Spiegelenburg commented on YARN-4971:

During service init, the service bind address is calculated from all the relevant settings and
stored as an InetSocketAddress in a local variable. During service start, the server is created
using that socket address. This all works.
In the {{ApplicationMasterService}} and the {{ClientRMService}}, the bind address is updated
after service start as part of the config update. This update does not happen in the other
services. The new value is the return value of {{updateConnectAddr()}}, whose javadoc states:
bq. The wildcard address is replaced with the local host's address.
Because we store this return value in the bind address, the wildcard is lost and binding to the
wildcard address breaks in later start cycles, for example after a failover.
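
The overwrite described above can be sketched in a few lines of plain Java. This is only an
illustration, not the RM code: {{connectAddrFor()}} and {{SketchService}} are hypothetical names,
and the loopback address stands in for the local host's address so the result is deterministic.
The "preserving" variant shows one way to avoid the overwrite and is not necessarily the fix
adopted for this issue.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

public class WildcardBindSketch {

    // Hypothetical stand-in for the documented behaviour: the wildcard
    // address is replaced with a concrete host address (loopback here,
    // as an assumption, instead of a real local host lookup).
    static InetSocketAddress connectAddrFor(InetSocketAddress addr) {
        InetAddress ip = addr.getAddress();
        if (ip != null && ip.isAnyLocalAddress()) {
            return new InetSocketAddress(InetAddress.getLoopbackAddress(), addr.getPort());
        }
        return addr;
    }

    // Simplified, hypothetical service that remembers its bind address
    // between start cycles.
    static class SketchService {
        InetSocketAddress bindAddress;

        SketchService(InetSocketAddress bindAddress) {
            this.bindAddress = bindAddress;
        }

        // Problematic pattern: the connect address overwrites the bind address.
        void startOverwriting() {
            bindAddress = connectAddrFor(bindAddress);
        }

        // Alternative: leave the bind address alone, hand back the connect address.
        InetSocketAddress startPreserving() {
            return connectAddrFor(bindAddress);
        }
    }

    public static void main(String[] args) {
        // new InetSocketAddress(port) uses the wildcard address.
        SketchService overwriting = new SketchService(new InetSocketAddress(8030));
        overwriting.startOverwriting();
        // prints "false": a later start would bind to a single address only
        System.out.println(overwriting.bindAddress.getAddress().isAnyLocalAddress());

        SketchService preserving = new SketchService(new InetSocketAddress(8030));
        preserving.startPreserving();
        // prints "true": the wildcard survives for the next start
        System.out.println(preserving.bindAddress.getAddress().isAnyLocalAddress());
    }
}
```

In the overwriting variant the wildcard is gone after the first start, which matches the symptom
that a second activation binds to only one address.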

> RM fails to re-bind to wildcard IP after failover in multi homed clusters
> -------------------------------------------------------------------------
>                 Key: YARN-4971
>                 URL: https://issues.apache.org/jira/browse/YARN-4971
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 3.0.0
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Wilfred Spiegelenburg
> If the RM has {{yarn.resourcemanager.bind-host}} set to the wildcard address, binding to the
> wildcard works as expected the first time the service becomes active. If the service has
> transitioned from active to standby and then becomes active again after a failover, the service
> binds to only one of the IP addresses.
> There is a difference between the services inside the RM: this only seems to happen for
> the services listening on ports 8030 and 8032.

