hadoop-yarn-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1481) ResourceManager and AdminService interact in a convoluted manner after YARN-1318
Date Mon, 09 Dec 2013 00:27:07 GMT

    [ https://issues.apache.org/jira/browse/YARN-1481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842810#comment-13842810

Karthik Kambatla commented on YARN-1481:

Thanks [~vinodkv] for the refactoring. The code reads better now.

As for the TODO items in the patch:
# HAServiceStatus returned by getServiceStatus() has an isReadyToBecomeActive method which
is used by the FailoverController to check if an RM is in a state to become Active. This corresponds
to Active and Standby states, as the RM can not be made Active when it is in the INITIALIZING
or STOPPING states. If I have missed the question entirely, please let me know.
+    if (isRMActive() || haState == HAServiceProtocol.HAServiceState.STANDBY) { // TODO: What
# Previously, AdminService was altering the Configuration and Bikas had suggested we set the
RM's conf in case any one decides to make a copy of the Configuration in AdminService. Long
story short, it can be removed.
+    if (this.rmContext.isHAEnabled()) {
+      HAUtil.verifyAndSetConfiguration(conf);
+      setConf(conf); // TODO: Why is this needed? Remove.
+    }
# I am not sure why we need a lock there in the first place. It should be safe to remove it
here, as transitionToStandby itself is synchronized.
+        synchronized(rm) { // TODO: Lock which class?
+          if (rmContext.isHAEnabled()) {
             try {
               // Transition to standby and reinit active services
               LOG.info("Transitioning RM to Standby mode");
-              adminService.transitionToStandby(true);
+              rm.transitionToStandby(true);

Comments on the patch itself: 
# The field haState in ResourceManager can be private. Looks like we forgot to make it private
along with other changes.
# {{RMContextImpl#setHAServiceState}} should be called everywhere haState is set in the RM,
AdminService depends on this. Looks like the test failures are also because of this. 
# Given RMContext has the haState, it might be better to remove it altogether from ResourceManager
and use the rmContext version.
# get and set HAState in RMContextImpl should be synchronized on haState before setting or
getting haState

> ResourceManager and AdminService interact in a convoluted manner after YARN-1318
> --------------------------------------------------------------------------------
>                 Key: YARN-1481
>                 URL: https://issues.apache.org/jira/browse/YARN-1481
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Vinod Kumar Vavilapalli
>         Attachments: YARN-1481-20131207.txt
> This is something I found while reviewing YARN-1318, but didn't halt that patch as many
cycles went there already. Some top level issues
>  - Not easy to follow RM's service life cycle
>     -- RM adds only AdminService as its service directly.
>     -- Other services are added to RM when AdminService's init calls RM.activeServices.init()
>  - Overall, AdminService shouldn't encompass all of RM's HA state management. It was
originally supposed to be the implementation of just the RPC server.

This message was sent by Atlassian JIRA

View raw message