hadoop-yarn-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1028) Add FailoverProxyProvider like capability to RMProxy
Date Fri, 15 Nov 2013 02:41:21 GMT

    [ https://issues.apache.org/jira/browse/YARN-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823230#comment-13823230 ]

Karthik Kambatla commented on YARN-1028:
----------------------------------------

bq. How do we know that the standby RM will be ready in that much time (if the current RM
has crashed)?
The Standby RM might not be ready within a second. However, we should connect as soon as it
is ready. There is no harm in alternating between the RMs until the RM failover succeeds.

bq. What kind of network errors cause failoverOnNetworkException to be triggered?
Several exceptions result in RETRY_BY_FAILOVER: ConnectException, SocketException, etc.
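
As a rough illustration of that classification, here is a minimal sketch. The class, enum, and method names are hypothetical and are not the retry classes in Hadoop Common; only ConnectException and SocketException come from the list above.

```java
import java.net.SocketException;

/** Illustrative only: hypothetical names, not the Hadoop retry API. */
final class FailoverDecisionSketch {
    /** Assumed two-way outcome; a real policy may have more states. */
    enum Decision { FAIL, RETRY_BY_FAILOVER }

    /** Connection-level failures suggest trying the other RM; anything else is surfaced. */
    static Decision decide(Exception e) {
        // java.net.ConnectException is a subclass of SocketException,
        // so this single check covers both exceptions named above.
        if (e instanceof SocketException) {
            return Decision.RETRY_BY_FAILOVER;
        }
        return Decision.FAIL;
    }
}
```

A caller would pass the exception thrown by the failed RPC to decide() and switch to the other RM only on RETRY_BY_FAILOVER.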

bq. Can they be caused due to glitches in the network, thus we would fail over to the standby
and failback again?
Yes, the Client/NM might try connecting to a different RM during a network glitch. However,
that is different from an RM failover. The worst thing that can happen when the Client/NM tries
to connect to the Standby and fails to (or, in the future, connects and realizes it is not
Active) is that it tries the Active RM on the next attempt.
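
To make the alternation concrete, here is a minimal sketch of a client that cycles through the configured RMs until one accepts the connection. All names (AlternatingRmSketch, Connector, connectWithFailover, the RM ids) are hypothetical and this is not the patch's code; the pause between attempts is passed in as a Runnable so the sketch stays free of timing assumptions.

```java
import java.net.ConnectException;
import java.util.List;

/** Illustrative only: hypothetical names, not the RMProxy implementation. */
final class AlternatingRmSketch {
    /** Stand-in for opening a connection to one RM; throws if that RM is not ready. */
    interface Connector {
        void connect(String rmId) throws ConnectException;
    }

    /** Which RM accepted the connection, and on which attempt. */
    static final class Result {
        final String rmId;
        final int attempts;
        Result(String rmId, int attempts) {
            this.rmId = rmId;
            this.attempts = attempts;
        }
    }

    /** Tries the RMs round-robin, pausing between attempts, until one succeeds. */
    static Result connectWithFailover(List<String> rmIds, Connector connector,
                                      int maxAttempts, Runnable pause) {
        if (rmIds.isEmpty() || maxAttempts < 1) {
            throw new IllegalArgumentException("need at least one RM and one attempt");
        }
        ConnectException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            // Alternate: attempt 0 -> first RM, attempt 1 -> second RM, and so on.
            String rmId = rmIds.get(attempt % rmIds.size());
            try {
                connector.connect(rmId);
                return new Result(rmId, attempt + 1);
            } catch (ConnectException e) {
                // A glitch and a real failover look the same here: try the next RM.
                last = e;
                if (attempt + 1 < maxAttempts) {
                    pause.run();
                }
            }
        }
        throw new IllegalStateException("no RM reachable after " + maxAttempts + " attempts", last);
    }
}
```

With two RMs, a failed attempt against the Standby costs one pause before the Active RM is tried again, which matches the worst case described above.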

> Add FailoverProxyProvider like capability to RMProxy
> ----------------------------------------------------
>
>                 Key: YARN-1028
>                 URL: https://issues.apache.org/jira/browse/YARN-1028
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Karthik Kambatla
>         Attachments: yarn-1028-1.patch, yarn-1028-draft-cumulative.patch
>
>
> RMProxy layer currently abstracts RM discovery and implements it by looking up service
> information from configuration. Motivated by HDFS and using existing classes from Common,
> we can add failover proxy providers that may provide RM discovery in extensible ways.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
