hadoop-yarn-issues mailing list archives

From "Bikas Saha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1029) Allow embedding leader election into the RM
Date Fri, 20 Dec 2013 13:58:25 GMT

    [ https://issues.apache.org/jira/browse/YARN-1029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13853970#comment-13853970 ]

Bikas Saha commented on YARN-1029:

Why is fencing configurable when the ZK store is self-fenced? I don't think we need to add any
fencing-related code for the embedded FC, except for a dummy fencer to pass into the elector code.
{code}+  public static final String RM_HA_FENCER = RM_HA_PREFIX + "fencer";{code}

Can we please consolidate all ZK configs in one place in the file?

Isn't rmId enough, since the rest of it is available from config? The port is anyway just one
of many RM ports.
{code}+  required int32 port = 1;
+  required string hostname = 2;
+  required string clusterid = 3;
+  required string rmId = 4;{code}

There is a separate JIRA open to add a cluster-id.
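
To illustrate the point about rmId being sufficient, here is a minimal sketch of resolving an RM's address from configuration given only its id. A plain Map stands in for the Hadoop Configuration, and the class name, method name, and the "<prefix><rmId>" key layout are assumptions made for this example, not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: resolves an RM's address from its id via configuration. */
final class RmAddressLookup {
  // Assumed key layout "<prefix><rmId>"; a stand-in for this sketch.
  static final String ADDRESS_KEY_PREFIX = "yarn.resourcemanager.address.";

  private final Map<String, String> conf;

  RmAddressLookup(Map<String, String> conf) {
    this.conf = new HashMap<>(conf); // defensive copy of the stand-in config
  }

  /** Returns the "host:port" configured for rmId, or throws if none is set. */
  String addressFor(String rmId) {
    String address = conf.get(ADDRESS_KEY_PREFIX + rmId);
    if (address == null) {
      throw new IllegalArgumentException("No address configured for RM id " + rmId);
    }
    return address;
  }
}
```

With a lookup like this, the data published for the election only needs to carry the rmId; host and port do not have to be duplicated in the protobuf message.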

Why was the synchronized dropped?
{code}-  private synchronized boolean isRMActive() {{code}

There is no fencer in embedded election, right?
{code}+  @Override
+  public void becomeStandby() {
+    try {
+      rm.transitionToStandby(true);
+    } catch (Exception e) {
+      // Log the exception. The fencer should be able to fence this node
+      LOG.error("RM could not transition to Standby mode", e);
+    }
+  }{code}

This is probably not enough. We need to notify the RM.
{code}+  public void notifyFatalError(String errorMessage) {
+    LOG.fatal("Received " + errorMessage);
+    throw new YarnRuntimeException(errorMessage);
+  }{code}
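
One way to notify the RM instead of only throwing is to forward the error to a listener that the RM registers. The sketch below is an assumption about how that could look; FatalErrorListener and ElectorFatalErrorForwarder are hypothetical names, not part of the patch or of Hadoop.

```java
/** Hypothetical hook the RM would implement to react to elector failures. */
interface FatalErrorListener {
  void onFatalError(String source, String message);
}

/** Illustrative callback fragment: forwards fatal elector errors to the RM. */
final class ElectorFatalErrorForwarder {
  private final FatalErrorListener rm;

  ElectorFatalErrorForwarder(FatalErrorListener rm) {
    this.rm = rm;
  }

  /** Logs the error, then hands it to the RM so it can decide how to react. */
  void notifyFatalError(String errorMessage) {
    System.err.println("Received " + errorMessage);
    rm.onFatalError("embedded-elector", errorMessage);
  }
}
```

The design choice here is that the elector only reports; whether the RM shuts down or transitions to standby stays the RM's decision.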

This should be empty. There is no fencing in embedded election because the ZK store is self-fenced.
{code}+  public void fenceOldActive(byte[] oldActiveData) {
+    RMHAServiceTarget target = dataToTarget(oldActiveData);
+    try {
+      target.checkFencingConfigured();
+    } catch (BadFencingConfigurationException e) {
+      throw new YarnBadConfigurationException(e.getMessage());
+    }
+    if (!target.getFencer().fence(target)) {
+      throw new YarnRuntimeException("Could not fence old active");
+    }
+  }{code}
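
A rough sketch of the empty variant suggested here, assuming the state store really does fence itself. FencingCallbackSketch is a local stand-in for the fencing part of the elector callback interface, and both names are hypothetical.

```java
/** Local stand-in for the fencing part of the elector callback. */
interface FencingCallbackSketch {
  void fenceOldActive(byte[] oldActiveData);
}

/** Illustrative callback for a deployment whose state store is self-fenced. */
final class SelfFencedCallback implements FencingCallbackSketch {
  @Override
  public void fenceOldActive(byte[] oldActiveData) {
    // Intentionally a no-op: this sketch assumes the ZK state store
    // rejects writes from the old active, so no explicit fencing is done.
  }
}
```

Because nothing is looked up or validated, the method cannot fail on malformed or missing data from the old active.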

I didn't quite get the purpose of the new thread. Why can't we call elector.joinElection()
in serviceStart()? There is no need for us to loop and keep calling joinElection() in a thread.
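
The thread-free alternative could look roughly like the sketch below. ElectorSketch is a local stand-in for the elector with only the two calls used here, EmbeddedElectorLifecycle is a hypothetical name, and the sketch assumes the elector handles re-joining after session events on its own.

```java
/** Local stand-in for the elector; only the calls used in this sketch. */
interface ElectorSketch {
  void joinElection(byte[] data);
  void quitElection(boolean needFence);
}

/** Illustrative service lifecycle: join once on start, quit on stop. */
final class EmbeddedElectorLifecycle {
  private final ElectorSketch elector;
  private final byte[] localActiveNodeInfo;

  EmbeddedElectorLifecycle(ElectorSketch elector, byte[] localActiveNodeInfo) {
    this.elector = elector;
    this.localActiveNodeInfo = localActiveNodeInfo.clone();
  }

  /** Joins the election exactly once; no background thread or retry loop. */
  void serviceStart() {
    elector.joinElection(localActiveNodeInfo);
  }

  /** Leaves the election without requesting fencing. */
  void serviceStop() {
    elector.quitElection(false);
  }
}
```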

Use the newly created HAUtil helper methods?
{code}+      if (conf.getBoolean(YarnConfiguration.AUTO_FAILOVER_ENABLED,
+          YarnConfiguration.DEFAULT_AUTO_FAILOVER_ENABLED)) {
+        // Automatic failover enabled
+        if (conf.getBoolean(YarnConfiguration.AUTO_FAILOVER_EMBEDDED,
+            YarnConfiguration.DEFAULT_AUTO_FAILOVER_EMBEDDED)) {
+          // Embedded automatic failover enabled
+          electorService = createRMZKActiveStandbyElectorService();
+          addIfService(electorService);{code}
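
As a sketch of what such helpers might look like, the nested checks can be folded into two small predicates. A Map stands in for the Configuration, and the method names, key strings, and the false defaults are assumptions made for this example rather than the actual HAUtil API.

```java
import java.util.Map;

/** Illustrative helpers that hide the nested failover config checks. */
final class HaConfigHelpers {
  // Key names and defaults are assumptions made for this sketch.
  static final String AUTO_FAILOVER_ENABLED =
      "yarn.resourcemanager.ha.automatic-failover.enabled";
  static final String AUTO_FAILOVER_EMBEDDED =
      "yarn.resourcemanager.ha.automatic-failover.embedded";

  private HaConfigHelpers() {}

  private static boolean getBoolean(Map<String, String> conf, String key, boolean dflt) {
    String value = conf.get(key);
    return value == null ? dflt : Boolean.parseBoolean(value.trim());
  }

  /** True when automatic failover is switched on. */
  static boolean isAutomaticFailoverEnabled(Map<String, String> conf) {
    return getBoolean(conf, AUTO_FAILOVER_ENABLED, false);
  }

  /** True only when automatic failover is on and set to run embedded. */
  static boolean isAutomaticFailoverEmbedded(Map<String, String> conf) {
    return isAutomaticFailoverEnabled(conf)
        && getBoolean(conf, AUTO_FAILOVER_EMBEDDED, false);
  }
}
```

The call site then reduces to a single condition on isAutomaticFailoverEmbedded(conf) before creating the elector service.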

In the embedded failover test, how do we know that the ZK-based failover is being triggered?
I did not understand how failover can happen so quickly when the ZK session timeout is 10s.

IMO the ElectorService should not be calling RM.transitionToActive/Standby. It should be calling
AdminService.transitionToActive/Standby. The AdminService is the only HA entry point into
the system. By calling directly into the RM we are breaking the abstractions that everything
else is going to follow.

Also, an alternative layering would be to make the ElectorService a member of the
AdminService. There is no need for the main body of the RM to know about failover or failover
controllers (FC) etc. Interaction with any FC for failover is abstracted in the AdminService.
So IMO, if the FC is configured to be embedded, we can maintain the abstraction and embed it
into the AdminService.
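
The layering proposed in the last two paragraphs could be sketched as follows. All three types are hypothetical stand-ins written for this example: the elector is owned by the admin layer and calls only the admin layer's transition methods, so the RM core never sees a failover controller.

```java
/** Stand-in for the RM core; only the admin layer calls into it. */
interface RmCoreSketch {
  void transitionToActive();
  void transitionToStandby();
}

/** Stand-in for the AdminService: the single HA entry point. */
final class AdminServiceSketch {
  private final RmCoreSketch rm;
  private final EmbeddedElectorSketch elector; // null unless the FC is embedded

  AdminServiceSketch(RmCoreSketch rm, boolean embedElector) {
    this.rm = rm;
    this.elector = embedElector ? new EmbeddedElectorSketch(this) : null;
  }

  void transitionToActive() { rm.transitionToActive(); }

  void transitionToStandby() { rm.transitionToStandby(); }

  EmbeddedElectorSketch elector() { return elector; }
}

/** Stand-in for the embedded elector; it knows the admin layer, not the RM. */
final class EmbeddedElectorSketch {
  private final AdminServiceSketch admin;

  EmbeddedElectorSketch(AdminServiceSketch admin) {
    this.admin = admin;
  }

  void becomeActive() { admin.transitionToActive(); }

  void becomeStandby() {
    try {
      admin.transitionToStandby();
    } catch (RuntimeException e) {
      // Log and continue; this sketch leaves recovery to the caller.
      System.err.println("RM could not transition to Standby mode: " + e);
    }
  }
}
```

Manual failover through the admin interface and automatic failover through the elector then share one code path into the RM.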

> Allow embedding leader election into the RM
> -------------------------------------------
>                 Key: YARN-1029
>                 URL: https://issues.apache.org/jira/browse/YARN-1029
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Karthik Kambatla
>         Attachments: embedded-zkfc-approach.patch, yarn-1029-0.patch, yarn-1029-0.patch,
yarn-1029-1.patch, yarn-1029-approach.patch
> It should be possible to embed the common ActiveStandbyElector into the RM such that ZooKeeper-
based leader election and notification are built in. In conjunction with a ZK state store,
this configuration will be a simple deployment option.

This message was sent by Atlassian JIRA
