Subject: svn commit: r1601869 [1/3] - in /hadoop/common/branches/HDFS-5442/hadoop-yarn-project: ./ hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/ hadoop-yarn/h...
Date: Wed, 11 Jun 2014 12:20:51 -0000
To: yarn-commits@hadoop.apache.org
From: vinayakumarb@apache.org
X-Mailer: svnmailer-1.0.9
Message-Id: <20140611122053.9EEDF2388AA9@eris.apache.org>
X-Virus-Checked: Checked by ClamAV on apache.org

Author: vinayakumarb
Date: Wed Jun 11 12:20:48 2014
New Revision: 1601869

URL: http://svn.apache.org/r1601869
Log:
Merged revision(s) 1601144-1601868, 1598456-1601149 from hadoop/common/trunk

Added:
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainerRecoverEvent.java
      - copied unchanged from r1601868, hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainerRecoverEvent.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeStartedEvent.java
      - copied unchanged from r1601868, hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeStartedEvent.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java
      - copied unchanged from r1601868, hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestWorkPreservingRMRestart.java
Modified:
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/CHANGES.txt
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationResourceUsageReport.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerExitStatus.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/ApplicationImpl.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerKillEvent.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainersMonitor.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MemoryRMStateStore.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/NullRMStateStore.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/ApplicationAttemptStateData.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/ApplicationStateData.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/impl/pb/ApplicationAttemptStateDataPBImpl.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/impl/pb/ApplicationStateDataPBImpl.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainerEventType.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainerImpl.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/Queue.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerNode.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerNode.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/NodeAddedSchedulerEvent.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AppSchedulable.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSParentQueue.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestFifoScheduler.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestMoveApplication.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/TestFSRMStateStore.java
    hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/TestRMAppTransitions.java

Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/CHANGES.txt
URL:
http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/CHANGES.txt?rev=1601869&r1=1601868&r2=1601869&view=diff
==============================================================================
--- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/CHANGES.txt (original)
+++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/CHANGES.txt Wed Jun 11 12:20:48 2014
@@ -32,6 +32,10 @@ Release 2.5.0 - UNRELEASED
     YARN-1338. Recover localized resource cache state upon nodemanager restart
     (Jason Lowe via junping_du)

+    YARN-1368. Added core functionality of recovering container state into
+    schedulers after ResourceManager Restart so as to preserve running work in
+    the cluster. (Jian He via vinodkv)
+
   IMPROVEMENTS

     YARN-1479. Invalid NaN values in Hadoop REST API JSON response (Chen He via
@@ -145,6 +149,15 @@ Release 2.5.0 - UNRELEASED
     YARN-2132. ZKRMStateStore.ZKAction#runWithRetries doesn't log the exception
     it encounters. (Vamsee Yarlagadda via kasha)

+    YARN-2030. Augmented RMStateStore with state machine.(Binglin Chang via jianhe)
+
+    YARN-1424. RMAppAttemptImpl should return the
+    DummyApplicationResourceUsageReport for all invalid accesses.
+    (Ray Chiang via kasha)
+
+    YARN-2091. Add more values to ContainerExitStatus and pass it from NM to
+    RM and then to app masters (Tsuyoshi OZAWA via bikas)
+
   OPTIMIZATIONS

   BUG FIXES

Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationResourceUsageReport.java
URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationResourceUsageReport.java?rev=1601869&r1=1601868&r2=1601869&view=diff
==============================================================================
--- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationResourceUsageReport.java (original)
+++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationResourceUsageReport.java Wed Jun 11 12:20:48 2014
@@ -47,7 +47,7 @@ public abstract class ApplicationResourc
   }

   /**
-   * Get the number of used containers
+   * Get the number of used containers. -1 for invalid/inaccessible reports.
    * @return the number of used containers
    */
   @Public
@@ -63,7 +63,7 @@ public abstract class ApplicationResourc
   public abstract void setNumUsedContainers(int num_containers);

   /**
-   * Get the number of reserved containers
+   * Get the number of reserved containers. -1 for invalid/inaccessible reports.
    * @return the number of reserved containers
    */
   @Private
@@ -79,7 +79,7 @@ public abstract class ApplicationResourc
   public abstract void setNumReservedContainers(int num_reserved_containers);

   /**
-   * Get the used Resource
+   * Get the used Resource. -1 for invalid/inaccessible reports.
    * @return the used Resource
    */
   @Public
@@ -91,7 +91,7 @@ public abstract class ApplicationResourc
   public abstract void setUsedResources(Resource resources);

   /**
-   * Get the reserved Resource
+   * Get the reserved Resource. -1 for invalid/inaccessible reports.
    * @return the reserved Resource
    */
   @Public
@@ -103,7 +103,7 @@ public abstract class ApplicationResourc
   public abstract void setReservedResources(Resource reserved_resources);

   /**
-   * Get the needed Resource
+   * Get the needed Resource. -1 for invalid/inaccessible reports.
    * @return the needed Resource
    */
   @Public

Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerExitStatus.java
URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerExitStatus.java?rev=1601869&r1=1601868&r2=1601869&view=diff
==============================================================================
--- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerExitStatus.java (original)
+++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerExitStatus.java Wed Jun 11 12:20:48 2014
@@ -46,4 +46,30 @@ public class ContainerExitStatus {
    * Containers preempted by the framework.
    */
   public static final int PREEMPTED = -102;
+
+  /**
+   * Container terminated because of exceeding allocated virtual memory.
+   */
+  public static final int KILLED_EXCEEDED_VMEM = -103;
+
+  /**
+   * Container terminated because of exceeding allocated physical memory.
+   */
+  public static final int KILLED_EXCEEDED_PMEM = -104;
+
+  /**
+   * Container was terminated by stop request by the app master.
+   */
+  public static final int KILLED_BY_APPMASTER = -105;
+
+  /**
+   * Container was terminated by the resource manager.
+   */
+  public static final int KILLED_BY_RESOURCEMANAGER = -106;
+
+  /**
+   * Container was terminated after the application finished.
+   */
+  public static final int KILLED_AFTER_APP_COMPLETION = -107;
+
 }

Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java?rev=1601869&r1=1601868&r2=1601869&view=diff
==============================================================================
--- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java (original)
+++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java Wed Jun 11 12:20:48 2014
@@ -318,6 +318,13 @@ public class YarnConfiguration extends C
   public static final String RECOVERY_ENABLED = RM_PREFIX + "recovery.enabled";
   public static final boolean DEFAULT_RM_RECOVERY_ENABLED = false;

+  @Private
+  public static final String RM_WORK_PRESERVING_RECOVERY_ENABLED = RM_PREFIX
+      + "work-preserving-recovery.enabled";
+  @Private
+  public static final boolean DEFAULT_RM_WORK_PRESERVING_RECOVERY_ENABLED =
+      false;
+
   /** Zookeeper interaction configs */
   public static final String RM_ZK_PREFIX = RM_PREFIX + "zk-";


Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml?rev=1601869&r1=1601868&r2=1601869&view=diff
==============================================================================
--- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml (original)
+++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml Wed Jun 11 12:20:48 2014
@@ -270,6 +270,14 @@
+    Enable RM work preserving recovery. This configuration is private
+    to YARN for experimenting the feature.
+
+    yarn.resourcemanager.work-preserving-recovery.enabled
+    false
+
+
+
     The class to use as the persistent store.

     If org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore

Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java?rev=1601869&r1=1601868&r2=1601869&view=diff
==============================================================================
--- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java (original)
+++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java Wed Jun 11 12:20:48 2014
@@ -64,6 +64,7 @@ import org.apache.hadoop.yarn.api.protoc
 import org.apache.hadoop.yarn.api.protocolrecords.StopContainersRequest;
 import org.apache.hadoop.yarn.api.protocolrecords.StopContainersResponse;
 import org.apache.hadoop.yarn.api.records.ApplicationId;
+import org.apache.hadoop.yarn.api.records.ContainerExitStatus;
 import org.apache.hadoop.yarn.api.records.ContainerId;
 import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
 import org.apache.hadoop.yarn.api.records.ContainerState;
@@ -738,7 +739,8 @@ public class ContainerManagerImpl extend
     } else {
       dispatcher.getEventHandler().handle(
         new ContainerKillEvent(containerID,
-          "Container killed by the ApplicationMaster."));
+            ContainerExitStatus.KILLED_BY_APPMASTER,
+            "Container killed by the ApplicationMaster."));

       NMAuditLogger.logSuccess(container.getUser(),
         AuditConstants.STOP_CONTAINER, "ContainerManageImpl", containerID
@@ -887,6 +889,7 @@ public class ContainerManagerImpl extend
             .getContainersToCleanup()) {
           this.dispatcher.getEventHandler().handle(
               new ContainerKillEvent(container,
+                  ContainerExitStatus.KILLED_BY_RESOURCEMANAGER,
                   "Container Killed by ResourceManager"));
       }
       break;

Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/ApplicationImpl.java
URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/ApplicationImpl.java?rev=1601869&r1=1601868&r2=1601869&view=diff
==============================================================================
--- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/ApplicationImpl.java (original)
+++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/application/ApplicationImpl.java Wed Jun 11 12:20:48 2014
@@ -30,6 +30,7 @@ import org.apache.commons.logging.LogFac
 import org.apache.hadoop.security.Credentials;
 import org.apache.hadoop.yarn.api.records.ApplicationAccessType;
 import org.apache.hadoop.yarn.api.records.ApplicationId;
+import org.apache.hadoop.yarn.api.records.ContainerExitStatus;
 import org.apache.hadoop.yarn.api.records.ContainerId;
 import org.apache.hadoop.yarn.event.Dispatcher;
 import org.apache.hadoop.yarn.logaggregation.ContainerLogsRetentionPolicy;
@@ -375,6 +376,7 @@ public class ApplicationImpl implements
       for (ContainerId containerID : app.containers.keySet()) {
         app.dispatcher.getEventHandler().handle(
             new ContainerKillEvent(containerID,
+                ContainerExitStatus.KILLED_AFTER_APP_COMPLETION,
                 "Container killed on application-finish event: " + appEvent.getDiagnostic()));
       }
       return ApplicationState.FINISHING_CONTAINERS_WAIT;

Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java?rev=1601869&r1=1601868&r2=1601869&view=diff
==============================================================================
--- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java (original)
+++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java Wed Jun 11 12:20:48 2014
@@ -48,7 +48,6 @@ import org.apache.hadoop.yarn.event.Disp
 import org.apache.hadoop.yarn.event.EventHandler;
 import org.apache.hadoop.yarn.security.ContainerTokenIdentifier;
 import org.apache.hadoop.yarn.server.api.protocolrecords.NMContainerStatus;
-import org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor.ExitCode;
 import org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger;
 import org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger.AuditConstants;
 import org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServicesEvent;
@@ -773,7 +772,7 @@ public class ContainerImpl implements Co
       container.cleanup();
       container.metrics.endInitingContainer();
       ContainerKillEvent killEvent = (ContainerKillEvent) event;
-      container.exitCode = ExitCode.TERMINATED.getExitCode();
+      container.exitCode = killEvent.getContainerExitStatus();
       container.diagnostics.append(killEvent.getDiagnostic()).append("\n");
       container.diagnostics.append("Container is killed before being launched.\n");
     }
@@ -817,6 +816,7 @@ public class ContainerImpl implements Co
           ContainersLauncherEventType.CLEANUP_CONTAINER));
       ContainerKillEvent killEvent = (ContainerKillEvent) event;
       container.diagnostics.append(killEvent.getDiagnostic()).append("\n");
+      container.exitCode = killEvent.getContainerExitStatus();
     }
   }

@@ -829,7 +829,10 @@ public class ContainerImpl implements Co
     @Override
     public void transition(ContainerImpl container, ContainerEvent event) {
       ContainerExitEvent exitEvent = (ContainerExitEvent) event;
-      container.exitCode = exitEvent.getExitCode();
+      if (container.hasDefaultExitCode()) {
+        container.exitCode = exitEvent.getExitCode();
+      }
+
       if (exitEvent.getDiagnosticInfo() != null) {
         container.diagnostics.append(exitEvent.getDiagnosticInfo())
           .append('\n');
@@ -871,7 +874,7 @@ public class ContainerImpl implements Co
     @Override
     public void transition(ContainerImpl container, ContainerEvent event) {
       ContainerKillEvent killEvent = (ContainerKillEvent) event;
-      container.exitCode = ExitCode.TERMINATED.getExitCode();
+      container.exitCode = killEvent.getContainerExitStatus();
       container.diagnostics.append(killEvent.getDiagnostic()).append("\n");
       container.diagnostics.append("Container is killed before being launched.\n");
       super.transition(container, event);
@@ -928,4 +931,9 @@ public class ContainerImpl implements Co
       this.readLock.unlock();
     }
   }
+
+  private boolean hasDefaultExitCode() {
+    return (this.exitCode == ContainerExitStatus.INVALID);
+  }
+
 }

Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerKillEvent.java
URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerKillEvent.java?rev=1601869&r1=1601868&r2=1601869&view=diff
==============================================================================
--- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerKillEvent.java (original)
+++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerKillEvent.java Wed Jun 11 12:20:48 2014
@@ -23,13 +23,21 @@ import org.apache.hadoop.yarn.api.record
 public class ContainerKillEvent extends ContainerEvent {

   private final String diagnostic;
+  private final int exitStatus;

-  public ContainerKillEvent(ContainerId cID, String diagnostic) {
+  public ContainerKillEvent(ContainerId cID,
+      int exitStatus, String diagnostic) {
     super(cID, ContainerEventType.KILL_CONTAINER);
+    this.exitStatus = exitStatus;
     this.diagnostic = diagnostic;
   }

   public String getDiagnostic() {
     return this.diagnostic;
   }
+
+  public int getContainerExitStatus() {
+    return this.exitStatus;
+  }
+
 }

Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java?rev=1601869&r1=1601868&r2=1601869&view=diff
==============================================================================
--- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java (original)
+++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java Wed Jun 11 12:20:48 2014
@@ -30,6 +30,7 @@ import org.apache.commons.logging.LogFac
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.service.AbstractService;
 import org.apache.hadoop.util.StringUtils.TraditionalBinaryPrefix;
+import org.apache.hadoop.yarn.api.records.ContainerExitStatus;
 import org.apache.hadoop.yarn.api.records.ContainerId;
 import org.apache.hadoop.yarn.conf.YarnConfiguration;
 import org.apache.hadoop.yarn.event.AsyncDispatcher;
@@ -403,6 +404,7 @@ public class ContainersMonitorImpl exten

             boolean isMemoryOverLimit = false;
             String msg = "";
+            int containerExitStatus = ContainerExitStatus.INVALID;
             if (isVmemCheckEnabled()
                 && isProcessTreeOverLimit(containerId.toString(),
                     currentVmemUsage, curMemUsageOfAgedProcesses, vmemLimit)) {
@@ -414,6 +416,7 @@ public class ContainersMonitorImpl exten
                   currentPmemUsage, pmemLimit,
                   pId, containerId, pTree);
               isMemoryOverLimit = true;
+              containerExitStatus = ContainerExitStatus.KILLED_EXCEEDED_VMEM;
             } else if (isPmemCheckEnabled()
                 && isProcessTreeOverLimit(containerId.toString(),
                     currentPmemUsage, curRssMemUsageOfAgedProcesses,
@@ -426,6 +429,7 @@ public class ContainersMonitorImpl exten
                   currentPmemUsage, pmemLimit,
                   pId, containerId, pTree);
               isMemoryOverLimit = true;
+              containerExitStatus = ContainerExitStatus.KILLED_EXCEEDED_PMEM;
             }

             if (isMemoryOverLimit) {
@@ -440,7 +444,8 @@ public class ContainersMonitorImpl exten
               }
               // kill the container
               eventDispatcher.getEventHandler().handle(
-                  new ContainerKillEvent(containerId, msg));
+                  new ContainerKillEvent(containerId,
+                      containerExitStatus, msg));
               it.remove();
               LOG.info("Removed ProcessTree with root " + pId);
             } else {

Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java?rev=1601869&r1=1601868&r2=1601869&view=diff
==============================================================================
--- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java (original)
+++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java Wed Jun 11 12:20:48 2014
@@ -31,6 +31,7 @@ import java.util.HashMap;
 import java.util.List;
 import java.util.Map;

+import org.apache.hadoop.yarn.api.records.ContainerExitStatus;
 import org.junit.Assert;

 import org.apache.commons.logging.LogFactory;
@@ -68,7 +69,6 @@ import org.apache.hadoop.yarn.security.C
 import org.apache.hadoop.yarn.security.NMTokenIdentifier;
 import org.apache.hadoop.yarn.server.api.ResourceManagerConstants;
 import org.apache.hadoop.yarn.server.nodemanager.CMgrCompletedAppsEvent;
-import org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor.ExitCode;
 import org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor;
 import org.apache.hadoop.yarn.server.nodemanager.DeletionService;
 import org.apache.hadoop.yarn.server.nodemanager.containermanager.TestAuxServices.ServiceA;
@@ -348,8 +348,7 @@ public class TestContainerManager extend
         GetContainerStatusesRequest.newInstance(containerIds);
     ContainerStatus containerStatus =
         containerManager.getContainerStatuses(gcsRequest).getContainerStatuses().get(0);
-    int expectedExitCode = Shell.WINDOWS ?
ExitCode.FORCE_KILLED.getExitCode() : - ExitCode.TERMINATED.getExitCode(); + int expectedExitCode = ContainerExitStatus.KILLED_BY_APPMASTER; Assert.assertEquals(expectedExitCode, containerStatus.getExitStatus()); // Assert that the process is not alive anymore Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java?rev=1601869&r1=1601868&r2=1601869&view=diff ============================================================================== --- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java (original) +++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java Wed Jun 11 12:20:48 2014 @@ -17,6 +17,7 @@ */ package org.apache.hadoop.yarn.server.nodemanager.containermanager.container; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; import static org.junit.Assert.assertEquals; import static org.junit.Assert.assertNotNull; import static org.junit.Assert.assertNull; @@ -319,7 +320,7 @@ public class TestContainer { assertEquals(ContainerState.NEW, wc.c.getContainerState()); wc.killContainer(); assertEquals(ContainerState.DONE, wc.c.getContainerState()); - assertEquals(ExitCode.TERMINATED.getExitCode(), + assertEquals(ContainerExitStatus.KILLED_BY_RESOURCEMANAGER, wc.c.cloneAndGetContainerStatus().getExitStatus()); 
assertTrue(wc.c.cloneAndGetContainerStatus().getDiagnostics() .contains("KillRequest")); @@ -339,7 +340,7 @@ public class TestContainer { assertEquals(ContainerState.LOCALIZING, wc.c.getContainerState()); wc.killContainer(); assertEquals(ContainerState.KILLING, wc.c.getContainerState()); - assertEquals(ExitCode.TERMINATED.getExitCode(), + assertEquals(ContainerExitStatus.KILLED_BY_RESOURCEMANAGER, wc.c.cloneAndGetContainerStatus().getExitStatus()); assertTrue(wc.c.cloneAndGetContainerStatus().getDiagnostics() .contains("KillRequest")); @@ -898,12 +899,14 @@ public class TestContainer { } public void killContainer() { - c.handle(new ContainerKillEvent(cId, "KillRequest")); + c.handle(new ContainerKillEvent(cId, + ContainerExitStatus.KILLED_BY_RESOURCEMANAGER, + "KillRequest")); drainDispatcherEvents(); } public void containerKilledOnRequest() { - int exitCode = ExitCode.FORCE_KILLED.getExitCode(); + int exitCode = ContainerExitStatus.KILLED_BY_RESOURCEMANAGER; String diagnosticMsg = "Container completed with exit code " + exitCode; c.handle(new ContainerExitEvent(cId, ContainerEventType.CONTAINER_KILLED_ON_REQUEST, exitCode, Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java?rev=1601869&r1=1601868&r2=1601869&view=diff ============================================================================== --- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java (original) +++ 
hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java Wed Jun 11 12:20:48 2014 @@ -18,6 +18,7 @@ package org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; import static org.junit.Assert.assertEquals; import static org.junit.Assert.assertThat; import static org.junit.Assert.fail; @@ -73,7 +74,6 @@ import org.apache.hadoop.yarn.event.Disp import org.apache.hadoop.yarn.event.Event; import org.apache.hadoop.yarn.event.EventHandler; import org.apache.hadoop.yarn.security.ContainerTokenIdentifier; -import org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor.ExitCode; import org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor; import org.apache.hadoop.yarn.server.nodemanager.containermanager.BaseContainerManagerTest; import org.apache.hadoop.yarn.server.nodemanager.containermanager.container.Container; @@ -604,8 +604,7 @@ public class TestContainerLaunch extends GetContainerStatusesRequest.newInstance(containerIds); ContainerStatus containerStatus = containerManager.getContainerStatuses(gcsRequest).getContainerStatuses().get(0); - int expectedExitCode = Shell.WINDOWS ? ExitCode.FORCE_KILLED.getExitCode() : - ExitCode.TERMINATED.getExitCode(); + int expectedExitCode = ContainerExitStatus.KILLED_BY_APPMASTER; Assert.assertEquals(expectedExitCode, containerStatus.getExitStatus()); // Assert that the process is not alive anymore @@ -717,7 +716,7 @@ public class TestContainerLaunch extends ContainerStatus containerStatus = containerManager.getContainerStatuses(gcsRequest) .getContainerStatuses().get(0); - Assert.assertEquals(ExitCode.FORCE_KILLED.getExitCode(), + Assert.assertEquals(ContainerExitStatus.KILLED_BY_APPMASTER, containerStatus.getExitStatus()); // Now verify the contents of the file. 
Script generates a message when it Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainersMonitor.java URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainersMonitor.java?rev=1601869&r1=1601868&r2=1601869&view=diff ============================================================================== --- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainersMonitor.java (original) +++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainersMonitor.java Wed Jun 11 12:20:48 2014 @@ -18,6 +18,7 @@ package org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor; +import org.apache.hadoop.yarn.api.records.ContainerExitStatus; import static org.junit.Assert.assertEquals; import static org.junit.Assert.assertFalse; import static org.junit.Assert.assertTrue; @@ -60,7 +61,6 @@ import org.apache.hadoop.yarn.event.Asyn import org.apache.hadoop.yarn.exceptions.YarnException; import org.apache.hadoop.yarn.security.ContainerTokenIdentifier; import org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor; -import org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor.ExitCode; import org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor.Signal; import org.apache.hadoop.yarn.server.nodemanager.Context; import org.apache.hadoop.yarn.server.nodemanager.containermanager.BaseContainerManagerTest; @@ -270,7 +270,7 @@ 
public class TestContainersMonitor exten GetContainerStatusesRequest.newInstance(containerIds); ContainerStatus containerStatus = containerManager.getContainerStatuses(gcsRequest).getContainerStatuses().get(0); - Assert.assertEquals(ExitCode.TERMINATED.getExitCode(), + Assert.assertEquals(ContainerExitStatus.KILLED_EXCEEDED_VMEM, containerStatus.getExitStatus()); String expectedMsgPattern = "Container \\[pid=" + pid + ",containerID=" + cId Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java?rev=1601869&r1=1601868&r2=1601869&view=diff ============================================================================== --- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java (original) +++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContext.java Wed Jun 11 12:20:48 2014 @@ -99,4 +99,6 @@ public interface RMContext { RMApplicationHistoryWriter rmApplicationHistoryWriter); ConfigurationProvider getConfigurationProvider(); + + boolean isWorkPreservingRecoveryEnabled(); } \ No newline at end of file Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java URL: 
http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java?rev=1601869&r1=1601868&r2=1601869&view=diff ============================================================================== --- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java (original) +++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMContextImpl.java Wed Jun 11 12:20:48 2014 @@ -60,6 +60,7 @@ public class RMContextImpl implements RM = new ConcurrentHashMap(); private boolean isHAEnabled; + private boolean isWorkPreservingRecoveryEnabled; private HAServiceState haServiceState = HAServiceProtocol.HAServiceState.INITIALIZING; @@ -329,6 +330,15 @@ public class RMContextImpl implements RM } } + public void setWorkPreservingRecoveryEnabled(boolean enabled) { + this.isWorkPreservingRecoveryEnabled = enabled; + } + + @Override + public boolean isWorkPreservingRecoveryEnabled() { + return this.isWorkPreservingRecoveryEnabled; + } + @Override public RMApplicationHistoryWriter getRMApplicationHistoryWriter() { return rmApplicationHistoryWriter; Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java?rev=1601869&r1=1601868&r2=1601869&view=diff 
============================================================================== --- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java (original) +++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java Wed Jun 11 12:20:48 2014 @@ -28,6 +28,7 @@ import org.apache.hadoop.security.Access import org.apache.hadoop.security.UserGroupInformation; import org.apache.hadoop.security.authorize.AccessControlList; import org.apache.hadoop.yarn.api.records.ApplicationAttemptId; +import org.apache.hadoop.yarn.api.records.ApplicationResourceUsageReport; import org.apache.hadoop.yarn.api.records.ContainerId; import org.apache.hadoop.yarn.api.records.NodeState; import org.apache.hadoop.yarn.api.records.Resource; @@ -43,6 +44,8 @@ import org.apache.hadoop.yarn.server.res import org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptState; import org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNode; import org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils; +import org.apache.hadoop.yarn.server.utils.BuilderUtils; +import org.apache.hadoop.yarn.util.resource.Resources; /** * Utility methods to aid serving RM data through the REST and RPC APIs @@ -225,4 +228,13 @@ public class RMServerUtils { } } + /** + * Statically defined dummy ApplicationResourceUsageREport. Used as + * a return value when a valid report cannot be found. 
+ */ + public static final ApplicationResourceUsageReport + DUMMY_APPLICATION_RESOURCE_USAGE_REPORT = + BuilderUtils.newApplicationResourceUsageReport(-1, -1, + Resources.createResource(-1, -1), Resources.createResource(-1, -1), + Resources.createResource(-1, -1)); } Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java?rev=1601869&r1=1601868&r2=1601869&view=diff ============================================================================== --- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java (original) +++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java Wed Jun 11 12:20:48 2014 @@ -364,9 +364,15 @@ public class ResourceManager extends Com YarnConfiguration.DEFAULT_RM_RECOVERY_ENABLED); RMStateStore rmStore = null; - if(isRecoveryEnabled) { + if (isRecoveryEnabled) { recoveryEnabled = true; - rmStore = RMStateStoreFactory.getStore(conf); + rmStore = RMStateStoreFactory.getStore(conf); + boolean isWorkPreservingRecoveryEnabled = + conf.getBoolean( + YarnConfiguration.RM_WORK_PRESERVING_RECOVERY_ENABLED, + YarnConfiguration.DEFAULT_RM_WORK_PRESERVING_RECOVERY_ENABLED); + rmContext + .setWorkPreservingRecoveryEnabled(isWorkPreservingRecoveryEnabled); } else { recoveryEnabled = false; rmStore = new NullRMStateStore(); Modified: 
hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java?rev=1601869&r1=1601868&r2=1601869&view=diff ============================================================================== --- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java (original) +++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java Wed Jun 11 12:20:48 2014 @@ -60,6 +60,7 @@ import org.apache.hadoop.yarn.server.res import org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeEventType; import org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl; import org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeReconnectEvent; +import org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeStartedEvent; import org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeStatusEvent; import org.apache.hadoop.yarn.server.resourcemanager.security.NMTokenSecretManagerInRM; import org.apache.hadoop.yarn.server.resourcemanager.security.RMContainerTokenSecretManager; @@ -243,11 +244,13 @@ public class ResourceTrackerService exte Resource capability = request.getResource(); String nodeManagerVersion = request.getNMVersion(); - if (!request.getNMContainerStatuses().isEmpty()) { - LOG.info("received container statuses on node manager register :" - + request.getNMContainerStatuses()); - for 
(NMContainerStatus report : request.getNMContainerStatuses()) { - handleNMContainerStatus(report); + if (!rmContext.isWorkPreservingRecoveryEnabled()) { + if (!request.getNMContainerStatuses().isEmpty()) { + LOG.info("received container statuses on node manager register :" + + request.getNMContainerStatuses()); + for (NMContainerStatus status : request.getNMContainerStatuses()) { + handleNMContainerStatus(status); + } } } RegisterNodeManagerResponse response = recordFactory @@ -308,7 +311,7 @@ public class ResourceTrackerService exte RMNode oldNode = this.rmContext.getRMNodes().putIfAbsent(nodeId, rmNode); if (oldNode == null) { this.rmContext.getDispatcher().getEventHandler().handle( - new RMNodeEvent(nodeId, RMNodeEventType.STARTED)); + new RMNodeStartedEvent(nodeId, request.getNMContainerStatuses())); } else { LOG.info("Reconnect from the node at: " + host); this.nmLivelinessMonitor.unregister(nodeId); Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java?rev=1601869&r1=1601868&r2=1601869&view=diff ============================================================================== --- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java (original) +++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java Wed Jun 11 12:20:48 2014 @@ 
-47,6 +47,8 @@ import org.apache.hadoop.yarn.proto.Yarn import org.apache.hadoop.yarn.proto.YarnServerResourceManagerServiceProtos.ApplicationStateDataProto; import org.apache.hadoop.yarn.proto.YarnServerResourceManagerServiceProtos.RMStateVersionProto; import org.apache.hadoop.yarn.security.client.RMDelegationTokenIdentifier; +import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.ApplicationAttemptStateData; +import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.ApplicationStateData; import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.RMStateVersion; import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.impl.pb.ApplicationAttemptStateDataPBImpl; import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.impl.pb.ApplicationStateDataPBImpl; @@ -314,7 +316,7 @@ public class FileSystemRMStateStore exte @Override public synchronized void storeApplicationStateInternal(ApplicationId appId, - ApplicationStateDataPBImpl appStateDataPB) throws Exception { + ApplicationStateData appStateDataPB) throws Exception { String appIdStr = appId.toString(); Path appDirPath = getAppDir(rmAppRoot, appIdStr); fs.mkdirs(appDirPath); @@ -334,7 +336,7 @@ public class FileSystemRMStateStore exte @Override public synchronized void updateApplicationStateInternal(ApplicationId appId, - ApplicationStateDataPBImpl appStateDataPB) throws Exception { + ApplicationStateData appStateDataPB) throws Exception { String appIdStr = appId.toString(); Path appDirPath = getAppDir(rmAppRoot, appIdStr); Path nodeCreatePath = getNodePath(appDirPath, appIdStr); @@ -354,7 +356,7 @@ public class FileSystemRMStateStore exte @Override public synchronized void storeApplicationAttemptStateInternal( ApplicationAttemptId appAttemptId, - ApplicationAttemptStateDataPBImpl attemptStateDataPB) + ApplicationAttemptStateData attemptStateDataPB) throws Exception { Path appDirPath = getAppDir(rmAppRoot, appAttemptId.getApplicationId().toString()); @@ 
-375,7 +377,7 @@ public class FileSystemRMStateStore exte @Override public synchronized void updateApplicationAttemptStateInternal( ApplicationAttemptId appAttemptId, - ApplicationAttemptStateDataPBImpl attemptStateDataPB) + ApplicationAttemptStateData attemptStateDataPB) throws Exception { Path appDirPath = getAppDir(rmAppRoot, appAttemptId.getApplicationId().toString()); Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MemoryRMStateStore.java URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MemoryRMStateStore.java?rev=1601869&r1=1601868&r2=1601869&view=diff ============================================================================== --- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MemoryRMStateStore.java (original) +++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/MemoryRMStateStore.java Wed Jun 11 12:20:48 2014 @@ -32,9 +32,9 @@ import org.apache.hadoop.yarn.api.record import org.apache.hadoop.yarn.api.records.ApplicationId; import org.apache.hadoop.yarn.exceptions.YarnRuntimeException; import org.apache.hadoop.yarn.security.client.RMDelegationTokenIdentifier; +import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.ApplicationAttemptStateData; +import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.ApplicationStateData; import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.RMStateVersion; -import 
org.apache.hadoop.yarn.server.resourcemanager.recovery.records.impl.pb.ApplicationAttemptStateDataPBImpl; -import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.impl.pb.ApplicationStateDataPBImpl; import com.google.common.annotations.VisibleForTesting; @@ -80,7 +80,7 @@ public class MemoryRMStateStore extends @Override public void storeApplicationStateInternal(ApplicationId appId, - ApplicationStateDataPBImpl appStateData) + ApplicationStateData appStateData) throws Exception { ApplicationState appState = new ApplicationState(appStateData.getSubmitTime(), @@ -92,7 +92,7 @@ public class MemoryRMStateStore extends @Override public void updateApplicationStateInternal(ApplicationId appId, - ApplicationStateDataPBImpl appStateData) throws Exception { + ApplicationStateData appStateData) throws Exception { ApplicationState updatedAppState = new ApplicationState(appStateData.getSubmitTime(), appStateData.getStartTime(), @@ -112,7 +112,7 @@ public class MemoryRMStateStore extends @Override public synchronized void storeApplicationAttemptStateInternal( ApplicationAttemptId appAttemptId, - ApplicationAttemptStateDataPBImpl attemptStateData) + ApplicationAttemptStateData attemptStateData) throws Exception { Credentials credentials = null; if(attemptStateData.getAppAttemptTokens() != null){ @@ -137,7 +137,7 @@ public class MemoryRMStateStore extends @Override public synchronized void updateApplicationAttemptStateInternal( ApplicationAttemptId appAttemptId, - ApplicationAttemptStateDataPBImpl attemptStateData) + ApplicationAttemptStateData attemptStateData) throws Exception { Credentials credentials = null; if (attemptStateData.getAppAttemptTokens() != null) { Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/NullRMStateStore.java URL: 
http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/NullRMStateStore.java?rev=1601869&r1=1601868&r2=1601869&view=diff
==============================================================================
--- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/NullRMStateStore.java (original)
+++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/NullRMStateStore.java Wed Jun 11 12:20:48 2014
@@ -25,9 +25,9 @@ import org.apache.hadoop.security.token.
 import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
 import org.apache.hadoop.yarn.api.records.ApplicationId;
 import org.apache.hadoop.yarn.security.client.RMDelegationTokenIdentifier;
+import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.ApplicationAttemptStateData;
+import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.ApplicationStateData;
 import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.RMStateVersion;
-import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.impl.pb.ApplicationAttemptStateDataPBImpl;
-import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.impl.pb.ApplicationStateDataPBImpl;
 
 @Unstable
 public class NullRMStateStore extends RMStateStore {
@@ -54,13 +54,13 @@ public class NullRMStateStore extends RM
 
   @Override
   protected void storeApplicationStateInternal(ApplicationId appId,
-      ApplicationStateDataPBImpl appStateData) throws Exception {
+      ApplicationStateData appStateData) throws Exception {
     // Do nothing
   }
 
   @Override
   protected void storeApplicationAttemptStateInternal(ApplicationAttemptId attemptId,
-      ApplicationAttemptStateDataPBImpl attemptStateData) throws Exception {
+      ApplicationAttemptStateData attemptStateData) throws Exception {
     // Do nothing
   }
 
@@ -102,13 +102,13 @@ public class NullRMStateStore extends RM
 
   @Override
   protected void updateApplicationStateInternal(ApplicationId appId,
-      ApplicationStateDataPBImpl appStateData) throws Exception {
+      ApplicationStateData appStateData) throws Exception {
     // Do nothing
   }
 
   @Override
   protected void updateApplicationAttemptStateInternal(ApplicationAttemptId attemptId,
-      ApplicationAttemptStateDataPBImpl attemptStateData) throws Exception {
+      ApplicationAttemptStateData attemptStateData) throws Exception {
   }
 
   @Override

Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java
URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java?rev=1601869&r1=1601868&r2=1601869&view=diff
==============================================================================
--- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java (original)
+++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStore.java Wed Jun 11 12:20:48 2014
@@ -18,7 +18,6 @@
 
 package org.apache.hadoop.yarn.server.resourcemanager.recovery;
 
-import java.nio.ByteBuffer;
 import java.util.HashMap;
 import java.util.HashSet;
 import java.util.Map;
@@ -31,7 +30,6 @@ import org.apache.commons.logging.LogFac
 import org.apache.hadoop.classification.InterfaceAudience.Private;
 import org.apache.hadoop.classification.InterfaceStability.Unstable;
 import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.io.DataOutputBuffer;
 import org.apache.hadoop.io.Text;
 import org.apache.hadoop.security.Credentials;
 import org.apache.hadoop.security.token.Token;
@@ -50,6 +48,8 @@ import org.apache.hadoop.yarn.security.A
 import org.apache.hadoop.yarn.security.client.RMDelegationTokenIdentifier;
 import org.apache.hadoop.yarn.server.resourcemanager.RMFatalEvent;
 import org.apache.hadoop.yarn.server.resourcemanager.RMFatalEventType;
+import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.ApplicationAttemptStateData;
+import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.ApplicationStateData;
 import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.RMStateVersion;
 import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.impl.pb.ApplicationAttemptStateDataPBImpl;
 import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.impl.pb.ApplicationStateDataPBImpl;
@@ -61,6 +61,10 @@ import org.apache.hadoop.yarn.server.res
 import org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptState;
 import org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.event.RMAppAttemptNewSavedEvent;
 import org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.event.RMAppAttemptUpdateSavedEvent;
+import org.apache.hadoop.yarn.state.InvalidStateTransitonException;
+import org.apache.hadoop.yarn.state.SingleArcTransition;
+import org.apache.hadoop.yarn.state.StateMachine;
+import org.apache.hadoop.yarn.state.StateMachineFactory;
 
 @Private
 @Unstable
@@ -83,8 +87,163 @@ public abstract class RMStateStore exten
 
   public static final Log LOG = LogFactory.getLog(RMStateStore.class);
 
+  private enum RMStateStoreState {
+    DEFAULT
+  };
+
+  private static final StateMachineFactory<RMStateStore,
+                                           RMStateStoreState,
+                                           RMStateStoreEventType,
+                                           RMStateStoreEvent>
+      stateMachineFactory = new StateMachineFactory<RMStateStore,
+                                                    RMStateStoreState,
+                                                    RMStateStoreEventType,
+                                                    RMStateStoreEvent>(
+      RMStateStoreState.DEFAULT)
+      .addTransition(RMStateStoreState.DEFAULT, RMStateStoreState.DEFAULT,
+          RMStateStoreEventType.STORE_APP, new StoreAppTransition())
+      .addTransition(RMStateStoreState.DEFAULT, RMStateStoreState.DEFAULT,
+          RMStateStoreEventType.UPDATE_APP, new UpdateAppTransition())
+      .addTransition(RMStateStoreState.DEFAULT, RMStateStoreState.DEFAULT,
+          RMStateStoreEventType.REMOVE_APP, new RemoveAppTransition())
+      .addTransition(RMStateStoreState.DEFAULT, RMStateStoreState.DEFAULT,
+          RMStateStoreEventType.STORE_APP_ATTEMPT, new StoreAppAttemptTransition())
+      .addTransition(RMStateStoreState.DEFAULT, RMStateStoreState.DEFAULT,
+          RMStateStoreEventType.UPDATE_APP_ATTEMPT, new UpdateAppAttemptTransition());
+
+  private final StateMachine<RMStateStoreState,
+                             RMStateStoreEventType,
+                             RMStateStoreEvent> stateMachine;
+
+  private static class StoreAppTransition
+      implements SingleArcTransition<RMStateStore, RMStateStoreEvent> {
+    @Override
+    public void transition(RMStateStore store, RMStateStoreEvent event) {
+      if (!(event instanceof RMStateStoreAppEvent)) {
+        // should never happen
+        LOG.error("Illegal event type: " + event.getClass());
+        return;
+      }
+      ApplicationState appState = ((RMStateStoreAppEvent) event).getAppState();
+      ApplicationId appId = appState.getAppId();
+      ApplicationStateData appStateData = ApplicationStateData
+          .newInstance(appState);
+      LOG.info("Storing info for app: " + appId);
+      try {
+        store.storeApplicationStateInternal(appId, appStateData);
+        store.notifyDoneStoringApplication(appId, null);
+      } catch (Exception e) {
+        LOG.error("Error storing app: " + appId, e);
+        store.notifyStoreOperationFailed(e);
+      }
+    };
+  }
+
+  private static class UpdateAppTransition implements
+      SingleArcTransition<RMStateStore, RMStateStoreEvent> {
+    @Override
+    public void transition(RMStateStore store, RMStateStoreEvent event) {
+      if (!(event instanceof RMStateUpdateAppEvent)) {
+        // should never happen
+        LOG.error("Illegal event type: " + event.getClass());
+        return;
+      }
+      ApplicationState appState = ((RMStateUpdateAppEvent) event).getAppState();
+      ApplicationId appId = appState.getAppId();
+      ApplicationStateData appStateData = ApplicationStateData
+          .newInstance(appState);
+      LOG.info("Updating info for app: " + appId);
+      try {
+        store.updateApplicationStateInternal(appId, appStateData);
+        store.notifyDoneUpdatingApplication(appId, null);
+      } catch (Exception e) {
+        LOG.error("Error updating app: " + appId, e);
+        store.notifyStoreOperationFailed(e);
+      }
+    };
+  }
+
+  private static class RemoveAppTransition implements
+      SingleArcTransition<RMStateStore, RMStateStoreEvent> {
+    @Override
+    public void transition(RMStateStore store, RMStateStoreEvent event) {
+      if (!(event instanceof RMStateStoreRemoveAppEvent)) {
+        // should never happen
+        LOG.error("Illegal event type: " + event.getClass());
+        return;
+      }
+      ApplicationState appState = ((RMStateStoreRemoveAppEvent) event)
+          .getAppState();
+      ApplicationId appId = appState.getAppId();
+      LOG.info("Removing info for app: " + appId);
+      try {
+        store.removeApplicationStateInternal(appState);
+      } catch (Exception e) {
+        LOG.error("Error removing app: " + appId, e);
+        store.notifyStoreOperationFailed(e);
+      }
+    };
+  }
+
+  private static class StoreAppAttemptTransition implements
+      SingleArcTransition<RMStateStore, RMStateStoreEvent> {
+    @Override
+    public void transition(RMStateStore store, RMStateStoreEvent event) {
+      if (!(event instanceof RMStateStoreAppAttemptEvent)) {
+        // should never happen
+        LOG.error("Illegal event type: " + event.getClass());
+        return;
+      }
+      ApplicationAttemptState attemptState =
+          ((RMStateStoreAppAttemptEvent) event).getAppAttemptState();
+      try {
+        ApplicationAttemptStateData attemptStateData =
+            ApplicationAttemptStateData.newInstance(attemptState);
+        if (LOG.isDebugEnabled()) {
+          LOG.debug("Storing info for attempt: " + attemptState.getAttemptId());
+        }
+        store.storeApplicationAttemptStateInternal(attemptState.getAttemptId(),
+            attemptStateData);
+        store.notifyDoneStoringApplicationAttempt(attemptState.getAttemptId(),
+            null);
+      } catch (Exception e) {
+        LOG.error("Error storing appAttempt: " + attemptState.getAttemptId(), e);
+        store.notifyStoreOperationFailed(e);
+      }
+    };
+  }
+
+  private static class UpdateAppAttemptTransition implements
+      SingleArcTransition<RMStateStore, RMStateStoreEvent> {
+    @Override
+    public void transition(RMStateStore store, RMStateStoreEvent event) {
+      if (!(event instanceof RMStateUpdateAppAttemptEvent)) {
+        // should never happen
+        LOG.error("Illegal event type: " + event.getClass());
+        return;
+      }
+      ApplicationAttemptState attemptState =
+          ((RMStateUpdateAppAttemptEvent) event).getAppAttemptState();
+      try {
+        ApplicationAttemptStateData attemptStateData = ApplicationAttemptStateData
+            .newInstance(attemptState);
+        if (LOG.isDebugEnabled()) {
+          LOG.debug("Updating info for attempt: " + attemptState.getAttemptId());
+        }
+        store.updateApplicationAttemptStateInternal(attemptState.getAttemptId(),
+            attemptStateData);
+        store.notifyDoneUpdatingApplicationAttempt(attemptState.getAttemptId(),
+            null);
+      } catch (Exception e) {
+        LOG.error("Error updating appAttempt: " + attemptState.getAttemptId(), e);
+        store.notifyStoreOperationFailed(e);
+      }
+    };
+  }
+
   public RMStateStore() {
     super(RMStateStore.class.getName());
+    stateMachine = stateMachineFactory.make(this);
   }
 
   /**
@@ -390,10 +549,10 @@ public abstract class RMStateStore exten
    * application.
    */
   protected abstract void storeApplicationStateInternal(ApplicationId appId,
-      ApplicationStateDataPBImpl appStateData) throws Exception;
+      ApplicationStateData appStateData) throws Exception;
 
   protected abstract void updateApplicationStateInternal(ApplicationId appId,
-      ApplicationStateDataPBImpl appStateData) throws Exception;
+      ApplicationStateData appStateData) throws Exception;
 
   @SuppressWarnings("unchecked")
   /**
@@ -428,11 +587,11 @@ public abstract class RMStateStore exten
    */
   protected abstract void storeApplicationAttemptStateInternal(
       ApplicationAttemptId attemptId,
-      ApplicationAttemptStateDataPBImpl attemptStateData) throws Exception;
+      ApplicationAttemptStateData attemptStateData) throws Exception;
 
   protected abstract void updateApplicationAttemptStateInternal(
       ApplicationAttemptId attemptId,
-      ApplicationAttemptStateDataPBImpl attemptStateData) throws Exception;
+      ApplicationAttemptStateData attemptStateData) throws Exception;
 
   /**
    * RMDTSecretManager call this to store the state of a delegation token
@@ -596,105 +755,10 @@ public abstract class RMStateStore exten
 
   // Dispatcher related code
   protected void handleStoreEvent(RMStateStoreEvent event) {
-    if (event.getType().equals(RMStateStoreEventType.STORE_APP)
-        || event.getType().equals(RMStateStoreEventType.UPDATE_APP)) {
-      ApplicationState appState = null;
-      if (event.getType().equals(RMStateStoreEventType.STORE_APP)) {
-        appState = ((RMStateStoreAppEvent) event).getAppState();
-      } else {
-        assert event.getType().equals(RMStateStoreEventType.UPDATE_APP);
-        appState = ((RMStateUpdateAppEvent) event).getAppState();
-      }
-
-      Exception storedException = null;
-      ApplicationStateDataPBImpl appStateData =
-          (ApplicationStateDataPBImpl) ApplicationStateDataPBImpl
-              .newApplicationStateData(appState.getSubmitTime(),
-                  appState.getStartTime(), appState.getUser(),
-                  appState.getApplicationSubmissionContext(), appState.getState(),
-                  appState.getDiagnostics(), appState.getFinishTime());
-
-      ApplicationId appId =
-          appState.getApplicationSubmissionContext().getApplicationId();
-
-      LOG.info("Storing info for app: " + appId);
-      try {
-        if (event.getType().equals(RMStateStoreEventType.STORE_APP)) {
-          storeApplicationStateInternal(appId, appStateData);
-          notifyDoneStoringApplication(appId, storedException);
-        } else {
-          assert event.getType().equals(RMStateStoreEventType.UPDATE_APP);
-          updateApplicationStateInternal(appId, appStateData);
-          notifyDoneUpdatingApplication(appId, storedException);
-        }
-      } catch (Exception e) {
-        LOG.error("Error storing/updating app: " + appId, e);
-        notifyStoreOperationFailed(e);
-      }
-    } else if (event.getType().equals(RMStateStoreEventType.STORE_APP_ATTEMPT)
-        || event.getType().equals(RMStateStoreEventType.UPDATE_APP_ATTEMPT)) {
-
-      ApplicationAttemptState attemptState = null;
-      if (event.getType().equals(RMStateStoreEventType.STORE_APP_ATTEMPT)) {
-        attemptState =
-            ((RMStateStoreAppAttemptEvent) event).getAppAttemptState();
-      } else {
-        assert event.getType().equals(RMStateStoreEventType.UPDATE_APP_ATTEMPT);
-        attemptState =
-            ((RMStateUpdateAppAttemptEvent) event).getAppAttemptState();
-      }
-
-      Exception storedException = null;
-      Credentials credentials = attemptState.getAppAttemptCredentials();
-      ByteBuffer appAttemptTokens = null;
-      try {
-        if (credentials != null) {
-          DataOutputBuffer dob = new DataOutputBuffer();
-          credentials.writeTokenStorageToStream(dob);
-          appAttemptTokens = ByteBuffer.wrap(dob.getData(), 0, dob.getLength());
-        }
-        ApplicationAttemptStateDataPBImpl attemptStateData =
-            (ApplicationAttemptStateDataPBImpl) ApplicationAttemptStateDataPBImpl
-                .newApplicationAttemptStateData(attemptState.getAttemptId(),
-                    attemptState.getMasterContainer(), appAttemptTokens,
-                    attemptState.getStartTime(), attemptState.getState(),
-                    attemptState.getFinalTrackingUrl(),
-                    attemptState.getDiagnostics(),
-                    attemptState.getFinalApplicationStatus());
-        if (LOG.isDebugEnabled()) {
-          LOG.debug("Storing info for attempt: " + attemptState.getAttemptId());
-        }
-        if (event.getType().equals(RMStateStoreEventType.STORE_APP_ATTEMPT)) {
-          storeApplicationAttemptStateInternal(attemptState.getAttemptId(),
-              attemptStateData);
-          notifyDoneStoringApplicationAttempt(attemptState.getAttemptId(),
-              storedException);
-        } else {
-          assert event.getType().equals(
-              RMStateStoreEventType.UPDATE_APP_ATTEMPT);
-          updateApplicationAttemptStateInternal(attemptState.getAttemptId(),
-              attemptStateData);
-          notifyDoneUpdatingApplicationAttempt(attemptState.getAttemptId(),
-              storedException);
-        }
-      } catch (Exception e) {
-        LOG.error(
-            "Error storing/updating appAttempt: " + attemptState.getAttemptId(), e);
-        notifyStoreOperationFailed(e);
-      }
-    } else if (event.getType().equals(RMStateStoreEventType.REMOVE_APP)) {
-      ApplicationState appState =
-          ((RMStateStoreRemoveAppEvent) event).getAppState();
-      ApplicationId appId = appState.getAppId();
-      LOG.info("Removing info for app: " + appId);
-      try {
-        removeApplicationStateInternal(appState);
-      } catch (Exception e) {
-        LOG.error("Error removing app: " + appId, e);
-        notifyStoreOperationFailed(e);
-      }
-    } else {
-      LOG.error("Unknown RMStateStoreEvent type: " + event.getType());
+    try {
+      this.stateMachine.doTransition(event.getType(), event);
+    } catch (InvalidStateTransitonException e) {
+      LOG.error("Can't handle this event at current state", e);
     }
   }
 

Modified: hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
URL: http://svn.apache.org/viewvc/hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java?rev=1601869&r1=1601868&r2=1601869&view=diff
==============================================================================
--- hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java (original)
+++ hadoop/common/branches/HDFS-5442/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java Wed Jun 11 12:20:48 2014
@@ -49,6 +49,8 @@ import org.apache.hadoop.yarn.proto.Yarn
 import org.apache.hadoop.yarn.proto.YarnServerResourceManagerServiceProtos.RMStateVersionProto;
 import org.apache.hadoop.yarn.security.client.RMDelegationTokenIdentifier;
 import org.apache.hadoop.yarn.server.resourcemanager.RMZKUtils;
+import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.ApplicationAttemptStateData;
+import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.ApplicationStateData;
 import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.RMStateVersion;
 import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.impl.pb.ApplicationAttemptStateDataPBImpl;
 import org.apache.hadoop.yarn.server.resourcemanager.recovery.records.impl.pb.ApplicationStateDataPBImpl;
@@ -551,7 +553,7 @@ public class ZKRMStateStore extends RMSt
 
   @Override
   public synchronized void storeApplicationStateInternal(ApplicationId appId,
-      ApplicationStateDataPBImpl appStateDataPB) throws Exception {
+      ApplicationStateData appStateDataPB) throws Exception {
     String nodeCreatePath = getNodePath(rmAppRoot, appId.toString());
 
     if (LOG.isDebugEnabled()) {
@@ -565,7 +567,7 @@ public class ZKRMStateStore extends RMSt
 
   @Override
   public synchronized void updateApplicationStateInternal(ApplicationId appId,
-      ApplicationStateDataPBImpl appStateDataPB) throws Exception {
+      ApplicationStateData appStateDataPB) throws Exception {
     String nodeUpdatePath = getNodePath(rmAppRoot, appId.toString());
 
     if (LOG.isDebugEnabled()) {
@@ -587,7 +589,7 @@ public class ZKRMStateStore extends RMSt
   @Override
   public synchronized void storeApplicationAttemptStateInternal(
       ApplicationAttemptId appAttemptId,
-      ApplicationAttemptStateDataPBImpl attemptStateDataPB)
+      ApplicationAttemptStateData attemptStateDataPB)
       throws Exception {
     String appDirPath = getNodePath(rmAppRoot,
         appAttemptId.getApplicationId().toString());
@@ -605,7 +607,7 @@ public class ZKRMStateStore extends RMSt
   @Override
   public synchronized void updateApplicationAttemptStateInternal(
       ApplicationAttemptId appAttemptId,
-      ApplicationAttemptStateDataPBImpl attemptStateDataPB)
+      ApplicationAttemptStateData attemptStateDataPB)
       throws Exception {
     String appIdStr = appAttemptId.getApplicationId().toString();
     String appAttemptIdStr = appAttemptId.toString();