hadoop-common-commits mailing list archives

From acmur...@apache.org
Subject svn commit: r1496799 [2/2] - /hadoop/common/branches/branch-2.1-beta/hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html
Date Wed, 26 Jun 2013 07:21:10 GMT

Modified: hadoop/common/branches/branch-2.1-beta/hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html
URL: http://svn.apache.org/viewvc/hadoop/common/branches/branch-2.1-beta/hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html?rev=1496799&r1=1496798&r2=1496799&view=diff
==============================================================================
--- hadoop/common/branches/branch-2.1-beta/hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html (original)
+++ hadoop/common/branches/branch-2.1-beta/hadoop-common-project/hadoop-common/src/main/docs/releasenotes.html Wed Jun 26 07:21:10 2013
@@ -1,4 +1,3835 @@
 <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
+<title>Hadoop  2.1.0-beta Release Notes</title>
+<STYLE type="text/css">
+	H1 {font-family: sans-serif}
+	H2 {font-family: sans-serif; margin-left: 7mm}
+	TABLE {margin-left: 7mm}
+</STYLE>
+</head>
+<body>
+<h1>Hadoop  2.1.0-beta Release Notes</h1>
+These release notes include new developer and user-facing incompatibilities, features, and major improvements. 
+<a name="changes"/>
+<h2>Changes since Hadoop 2.0.5-alpha</h2>
+<ul>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-874">YARN-874</a>.
+     Blocker bug reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli <br>
+     <b>Tracking YARN/MR test failures after HADOOP-9421 and YARN-827</b><br>
+     <blockquote>HADOOP-9421 and YARN-827 broke some YARN/MR tests. Tracking those..</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-869">YARN-869</a>.
+     Blocker bug reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli <br>
+     <b>ResourceManagerAdministrationProtocol should neither be public(yet) nor in yarn.api</b><br>
+     <blockquote>This is an admin-only api that we don't yet know whether people can or should write new tools against. I am going to move it to yarn.server.api and make it @Private.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-861">YARN-861</a>.
+     Critical bug reported by Devaraj K and fixed by Vinod Kumar Vavilapalli (nodemanager)<br>
+     <b>TestContainerManager is failing</b><br>
+     <blockquote>https://builds.apache.org/job/Hadoop-Yarn-trunk/246/
+
+{code:xml}
+Running org.apache.hadoop.yarn.server.nodemanager.containermanager.TestContainerManager
+Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 19.249 sec &lt;&lt;&lt; FAILURE!
+testContainerManagerInitialization(org.apache.hadoop.yarn.server.nodemanager.containermanager.TestContainerManager)  Time elapsed: 286 sec  &lt;&lt;&lt; FAILURE!
+junit.framework.ComparisonFailure: expected:&lt;[asf009.sp2.ygridcore.ne]t&gt; but was:&lt;[localhos]t&gt;
+	at junit.framework.Assert.assertEquals(Assert.java:85)
+
+{code}</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-854">YARN-854</a>.
+     Blocker bug reported by Ramya Sunil and fixed by Omkar Vinit Joshi <br>
+     <b>App submission fails on secure deploy</b><br>
+     <blockquote>App submission on secure cluster fails with the following exception:
+
+{noformat}
+INFO mapreduce.Job: Job jobID failed with state FAILED due to: Application applicationID failed 2 times due to AM Container for appattemptID exited with  exitCode: -1000 due to: App initialization failed (255) with output: main : command provided 0
+main : user is qa_user
+javax.security.sasl.SaslException: DIGEST-MD5: digest response format violation. Mismatched response. [Caused by org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): DIGEST-MD5: digest response format violation. Mismatched response.]
+	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
+	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
+	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
+	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
+	at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
+	at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
+	at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:65)
+	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:235)
+	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:169)
+	at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.main(ContainerLocalizer.java:348)
+Caused by: org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): DIGEST-MD5: digest response format violation. Mismatched response.
+	at org.apache.hadoop.ipc.Client.call(Client.java:1298)
+	at org.apache.hadoop.ipc.Client.call(Client.java:1250)
+	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:204)
+	at $Proxy7.heartbeat(Unknown Source)
+	at org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
+	... 3 more
+
+.Failing this attempt.. Failing the application.
+
+{noformat}</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-852">YARN-852</a>.
+     Minor bug reported by Chuan Liu and fixed by Chuan Liu <br>
+     <b>TestAggregatedLogFormat.testContainerLogsFileAccess fails on Windows</b><br>
+     <blockquote>The YARN unit test case fails on Windows when comparing the expected message with the log message in the file. The expected message constructed in the test case has two problems: 1) it uses Path.separator to concatenate path strings. Path.separator is always a forward slash, which does not match the backslash used in the log message. 2) On Windows, the default file owner is the Administrators group if the file is created by an Administrators user. The test expects the owner to be the current user.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-851">YARN-851</a>.
+     Major bug reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi <br>
+     <b>Share NMTokens using NMTokenCache (api-based) instead of memory based approach which is used currently.</b><br>
+     <blockquote>It is a follow up ticket for YARN-694. Changing the way NMTokens are shared.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-850">YARN-850</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Rename getClusterAvailableResources to getAvailableResources in AMRMClients</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-848">YARN-848</a>.
+     Major bug reported by Hitesh Shah and fixed by Hitesh Shah <br>
+     <b>Nodemanager does not register with RM using the fully qualified hostname</b><br>
+     <blockquote>If the hostname is misconfigured to not be fully qualified (i.e. hostname returns foo and hostname -f returns foo.bar.xyz), the NM ends up registering with the RM using only "foo". This can create problems if DNS cannot resolve the hostname properly. 
+
+Furthermore, HDFS uses fully qualified hostnames which can end up affecting locality matches when allocating containers based on block locations. </blockquote></li>
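The short-name vs fully-qualified-name distinction above can be sketched with the standard Java API (an illustration only, not the NM's actual registration code; the helper names are assumptions):

```java
public class HostnameCheck {
    // Heuristic from the report: "foo" is a short name, "foo.bar.xyz" is
    // fully qualified (hypothetical helper, not the NM's actual logic).
    static boolean isFullyQualified(String hostname) {
        return hostname.indexOf('.') > 0;
    }

    public static void main(String[] args) throws Exception {
        // In Java terms, getHostName() may return the short name (like
        // `hostname`), while getCanonicalHostName() does a reverse DNS
        // lookup (like `hostname -f`).
        String canonical =
            java.net.InetAddress.getLocalHost().getCanonicalHostName();
        System.out.println(canonical + " -> " + isFullyQualified(canonical));
        System.out.println(isFullyQualified("foo"));         // false
        System.out.println(isFullyQualified("foo.bar.xyz")); // true
    }
}
```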
+<li> <a href="https://issues.apache.org/jira/browse/YARN-846">YARN-846</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Move pb Impl from yarn-api to yarn-common</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-841">YARN-841</a>.
+     Major sub-task reported by Siddharth Seth and fixed by Vinod Kumar Vavilapalli <br>
+     <b>Annotate and document AuxService APIs</b><br>
+     <blockquote>For users writing their own AuxServices, these APIs should be annotated and need better documentation. Also, the classes may need to move out of the NodeManager.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-840">YARN-840</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Move ProtoUtils to  yarn.api.records.pb.impl</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-839">YARN-839</a>.
+     Minor bug reported by Chuan Liu and fixed by Chuan Liu <br>
+     <b>TestContainerLaunch.testContainerEnvVariables fails on Windows</b><br>
+     <blockquote>The unit test case fails on Windows because the job id or container id was not printed out as part of the container script. Later, the test tries to read the pid from the output file, and fails.
+
+Exception in trunk:
+{noformat}
+Running org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch
+Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 9.903 sec &lt;&lt;&lt; FAILURE!
+testContainerEnvVariables(org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch)  Time elapsed: 1307 sec  &lt;&lt;&lt; ERROR!
+java.lang.NullPointerException
+        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch.testContainerEnvVariables(TestContainerLaunch.java:278)
+        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
+        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
+        at java.lang.reflect.Method.invoke(Method.java:597)
+        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
+        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
+        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
+        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
+        at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
+{noformat}</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-837">YARN-837</a>.
+     Major sub-task reported by Zhijie Shen and fixed by Zhijie Shen <br>
+     <b>ClusterInfo.java doesn't seem to belong to org.apache.hadoop.yarn</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-834">YARN-834</a>.
+     Blocker sub-task reported by Arun C Murthy and fixed by Zhijie Shen <br>
+     <b>Review/fix annotations for yarn-client module and clearly differentiate *Async apis</b><br>
+     <blockquote>Review/fix annotations for yarn-client module</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-833">YARN-833</a>.
+     Major bug reported by Zhijie Shen and fixed by Zhijie Shen <br>
+     <b>Move Graph and VisualizeStateMachine into yarn.state package</b><br>
+     <blockquote>Graph and VisualizeStateMachine are only used by state machine, they should belong to state package.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-831">YARN-831</a>.
+     Blocker sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Remove resource min from GetNewApplicationResponse</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-829">YARN-829</a>.
+     Major bug reported by Zhijie Shen and fixed by Zhijie Shen <br>
+     <b>Rename RMTokenSelector to be RMDelegationTokenSelector</b><br>
+     <blockquote>Therefore, the name of it will be consistent with that of RMDelegationTokenIdentifier.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-828">YARN-828</a>.
+     Major bug reported by Zhijie Shen and fixed by Zhijie Shen <br>
+     <b>Remove YarnVersionAnnotation</b><br>
+     <blockquote>YarnVersionAnnotation is not used at all, and the version information can be accessed through YarnVersionInfo instead.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-827">YARN-827</a>.
+     Critical sub-task reported by Bikas Saha and fixed by Jian He <br>
+     <b>Need to make Resource arithmetic methods accessible</b><br>
+     <blockquote>org.apache.hadoop.yarn.server.resourcemanager.resource has stuff like Resources and Calculators that help compare/add resources etc. Without these users will be forced to replicate the logic, potentially incorrectly.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-826">YARN-826</a>.
+     Major sub-task reported by Zhijie Shen and fixed by Zhijie Shen <br>
+     <b>Move Clock/SystemClock to util package</b><br>
+     <blockquote>Clock/SystemClock should belong to util.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-825">YARN-825</a>.
+     Blocker sub-task reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli <br>
+     <b>Fix yarn-common javadoc annotations</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-824">YARN-824</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Add  static factory to yarn client lib interface and change it to abstract class</b><br>
+     <blockquote>Do this for AMRMClient, NMClient, and YarnClient, and annotate their impls as private.
+The purpose is to not expose the impls.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-823">YARN-823</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Move RMAdmin from yarn.client to yarn.client.cli and rename as RMAdminCLI</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-822">YARN-822</a>.
+     Major sub-task reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi <br>
+     <b>Rename ApplicationToken to AMRMToken</b><br>
+     <blockquote>API change. At present this token is used on the scheduler api AMRMProtocol. Right now the name is a little confusing, as it suggests the token might be useful for the application to talk to the complete yarn system (RM/NM), but that is not the case after YARN-694. NM will have a specific NMToken, so it is better to name it AMRMToken.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-821">YARN-821</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Rename FinishApplicationMasterRequest.setFinishApplicationStatus to setFinalApplicationStatus to be consistent with getter</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-812">YARN-812</a>.
+     Major bug reported by Ramya Sunil and fixed by Siddharth Seth <br>
+     <b>Enabling app summary logs causes 'FileNotFound' errors</b><br>
+     <blockquote>RM app summary logs have been enabled as per the default config:
+
+{noformat}
+#
+# Yarn ResourceManager Application Summary Log 
+#
+# Set the ResourceManager summary log filename
+yarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log
+# Set the ResourceManager summary log level and appender
+yarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY
+
+# Appender for ResourceManager Application Summary Log
+# Requires the following properties to be set
+#    - hadoop.log.dir (Hadoop Log directory)
+#    - yarn.server.resourcemanager.appsummary.log.file (resource manager app summary log filename)
+#    - yarn.server.resourcemanager.appsummary.logger (resource manager app summary log level and appender)
+
+log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger}
+log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=false
+log4j.appender.RMSUMMARY=org.apache.log4j.RollingFileAppender
+log4j.appender.RMSUMMARY.File=${hadoop.log.dir}/${yarn.server.resourcemanager.appsummary.log.file}
+log4j.appender.RMSUMMARY.MaxFileSize=256MB
+log4j.appender.RMSUMMARY.MaxBackupIndex=20
+log4j.appender.RMSUMMARY.layout=org.apache.log4j.PatternLayout
+log4j.appender.RMSUMMARY.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
+{noformat}
+
+This however, throws errors while running commands as non-superuser:
+{noformat}
+-bash-4.1$ hadoop dfs -ls /
+DEPRECATED: Use of this script to execute hdfs command is deprecated.
+Instead use the hdfs command for it.
+
+log4j:ERROR setFile(null,true) call failed.
+java.io.FileNotFoundException: /var/log/hadoop/hadoopqa/rm-appsummary.log (No such file or directory)
+        at java.io.FileOutputStream.openAppend(Native Method)
+        at java.io.FileOutputStream.&lt;init&gt;(FileOutputStream.java:192)
+        at java.io.FileOutputStream.&lt;init&gt;(FileOutputStream.java:116)
+        at org.apache.log4j.FileAppender.setFile(FileAppender.java:294)
+        at org.apache.log4j.RollingFileAppender.setFile(RollingFileAppender.java:207)
+        at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:165)
+        at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:307)
+        at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:172)
+        at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:104)
+        at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:842)
+        at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:768)
+        at org.apache.log4j.PropertyConfigurator.parseCatsAndRenderers(PropertyConfigurator.java:672)
+        at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:516)
+        at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:580)
+        at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
+        at org.apache.log4j.LogManager.&lt;clinit&gt;(LogManager.java:127)
+        at org.apache.log4j.Logger.getLogger(Logger.java:104)
+        at org.apache.commons.logging.impl.Log4JLogger.getLogger(Log4JLogger.java:289)
+        at org.apache.commons.logging.impl.Log4JLogger.&lt;init&gt;(Log4JLogger.java:109)
+        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
+        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
+        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
+        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
+        at org.apache.commons.logging.impl.LogFactoryImpl.createLogFromClass(LogFactoryImpl.java:1116)
+        at org.apache.commons.logging.impl.LogFactoryImpl.discoverLogImplementation(LogFactoryImpl.java:858)
+        at org.apache.commons.logging.impl.LogFactoryImpl.newInstance(LogFactoryImpl.java:604)
+        at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:336)
+        at org.apache.commons.logging.impl.LogFactoryImpl.getInstance(LogFactoryImpl.java:310)
+        at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:685)
+        at org.apache.hadoop.fs.FsShell.&lt;clinit&gt;(FsShell.java:41)
+Found 1 items
+drwxr-xr-x   - hadoop   hadoop            0 2013-06-12 21:28 /user
+{noformat}</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-806">YARN-806</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Move ContainerExitStatus from yarn.api to yarn.api.records</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-805">YARN-805</a>.
+     Blocker sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Fix yarn-api javadoc annotations</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-803">YARN-803</a>.
+     Major improvement reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur (resourcemanager , scheduler)<br>
+     <b>factor out scheduler config validation from the ResourceManager to each scheduler implementation</b><br>
+     <blockquote>Per discussion in YARN-789, we should factor the scheduler config validations out of the ResourceManager class.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-799">YARN-799</a>.
+     Major bug reported by Chris Riccomini and fixed by Chris Riccomini (nodemanager)<br>
+     <b>CgroupsLCEResourcesHandler tries to write to cgroup.procs</b><br>
+     <blockquote>The implementation of
+
+bq. ./hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/CgroupsLCEResourcesHandler.java
+
+Tells the container-executor to write PIDs to cgroup.procs:
+
+{code}
+  public String getResourcesOption(ContainerId containerId) {
+    String containerName = containerId.toString();
+    StringBuilder sb = new StringBuilder("cgroups=");
+
+    if (isCpuWeightEnabled()) {
+      sb.append(pathForCgroup(CONTROLLER_CPU, containerName) + "/cgroup.procs");
+      sb.append(",");
+    }
+
+    if (sb.charAt(sb.length() - 1) == ',') {
+      sb.deleteCharAt(sb.length() - 1);
+    } 
+    return sb.toString();
+  }
+{code}
+
+Apparently, this file has not always been writeable:
+
+https://patchwork.kernel.org/patch/116146/
+http://lkml.indiana.edu/hypermail/linux/kernel/1004.1/00536.html
+https://lists.linux-foundation.org/pipermail/containers/2009-July/019679.html
+
+The RHEL version of the Linux kernel that I'm using has a CGroup module that has a non-writeable cgroup.procs file.
+
+{quote}
+$ uname -a
+Linux criccomi-ld 2.6.32-131.4.1.el6.x86_64 #1 SMP Fri Jun 10 10:54:26 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
+{quote}
+
+As a result, when the container-executor tries to run, it fails with this error message:
+
+bq.    fprintf(LOGFILE, "Failed to write pid %s (%d) to file %s - %s\n",
+
+This is because the executor is given a resource by the CgroupsLCEResourcesHandler that includes cgroup.procs, which is non-writeable:
+
+{quote}
+$ pwd 
+/cgroup/cpu/hadoop-yarn/container_1370986842149_0001_01_000001
+$ ls -l
+total 0
+-r--r--r-- 1 criccomi eng 0 Jun 11 14:43 cgroup.procs
+-rw-r--r-- 1 criccomi eng 0 Jun 11 14:43 cpu.rt_period_us
+-rw-r--r-- 1 criccomi eng 0 Jun 11 14:43 cpu.rt_runtime_us
+-rw-r--r-- 1 criccomi eng 0 Jun 11 14:43 cpu.shares
+-rw-r--r-- 1 criccomi eng 0 Jun 11 14:43 notify_on_release
+-rw-r--r-- 1 criccomi eng 0 Jun 11 14:43 tasks
+{quote}
+
+I patched CgroupsLCEResourcesHandler to use /tasks instead of /cgroup.procs, and this appears to have fixed the problem.
+
+I can think of several potential resolutions to this ticket:
+
+1. Ignore the problem, and make people patch YARN when they hit this issue.
+2. Write to /tasks instead of /cgroup.procs for everyone
+3. Check permissioning on /cgroup.procs prior to writing to it, and fall back to /tasks.
+4. Add a config to yarn-site that lets admins specify which file to write to.
+
+Thoughts?</blockquote></li>
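Option 3 from the list above could be sketched as follows (a hypothetical helper, not the actual patch to CgroupsLCEResourcesHandler; file names match the directory listing in the report):

```java
import java.io.File;

public class CgroupTaskFileChooser {
    // Sketch of option 3: prefer cgroup.procs when it is writable, and fall
    // back to the older tasks file otherwise (illustrative only).
    static String taskFileFor(String cgroupPath) {
        File procs = new File(cgroupPath, "cgroup.procs");
        if (procs.exists() && procs.canWrite()) {
            return procs.getPath();
        }
        // On kernels where cgroup.procs is read-only (e.g. the RHEL
        // 2.6.32 kernel quoted above), the tasks file is still writable.
        return new File(cgroupPath, "tasks").getPath();
    }

    public static void main(String[] args) {
        // A nonexistent path exercises the fallback branch.
        System.out.println(taskFileFor("/no/such/cgroup"));
    }
}
```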
+<li> <a href="https://issues.apache.org/jira/browse/YARN-795">YARN-795</a>.
+     Major bug reported by Wei Yan and fixed by Wei Yan (scheduler)<br>
+     <b>Fair scheduler queue metrics should subtract allocated vCores from available vCores</b><br>
+     <blockquote>The queue metrics of the fair scheduler don't subtract allocated vCores from available vCores, so the available vCores returned are incorrect.
+This is happening because {code}QueueMetrics.getAllocateResources(){code} doesn't return the allocated vCores.</blockquote></li>
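The fix amounts to performing the same subtraction for vCores that is already done for memory; a hedged sketch (field and method names are assumptions for illustration, not the real QueueMetrics, which uses metrics2 gauges rather than plain ints):

```java
public class QueueMetricsSketch {
    int totalMB, totalVCores;
    int allocatedMB, allocatedVCores;

    int availableMB() {
        return totalMB - allocatedMB;
    }

    // The bug: available vCores were reported without subtracting the
    // allocated vCores. The corrected form mirrors availableMB().
    int availableVCores() {
        return totalVCores - allocatedVCores;
    }

    public static void main(String[] args) {
        QueueMetricsSketch m = new QueueMetricsSketch();
        m.totalVCores = 16;
        m.allocatedVCores = 6;
        System.out.println("available vCores: " + m.availableVCores()); // 10
    }
}
```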
+<li> <a href="https://issues.apache.org/jira/browse/YARN-792">YARN-792</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Move NodeHealthStatus from yarn.api.record to yarn.server.api.record</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-789">YARN-789</a>.
+     Major improvement reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur (scheduler)<br>
+     <b>Enable zero capabilities resource requests in fair scheduler</b><br>
+     <blockquote>Per discussion in YARN-689, reposting updated use case:
+
+1. I have a set of services co-existing with a Yarn cluster.
+
+2. These services run out of band from Yarn. They are not started as yarn containers and they don't use Yarn containers for processing.
+
+3. These services use, dynamically, different amounts of CPU and memory based on their load. They manage their CPU and memory requirements independently. In other words, depending on their load, they may require more CPU but not memory or vice-versa.
+By using YARN as the RM for these services I'm able to share and utilize the resources of the cluster appropriately and in a dynamic way. Yarn keeps tabs on all the resources.
+
+These services run an AM that reserves resources on their behalf. When this AM gets the requested resources, the services bump up their CPU/memory utilization out of band from Yarn. If the Yarn allocations are released/preempted, the services back off on their resource utilization. By doing this, Yarn and these services correctly share the cluster resources, with the Yarn RM being the only one that does the overall resource bookkeeping.
+
+The services' AM, so as not to break the lifecycle of containers, starts containers in the corresponding NMs. These container processes basically sleep forever (i.e. sleep 10000d). They use almost no CPU or memory (less than 1MB). Thus it is reasonable to assume their required CPU and memory utilization is NIL (more on hard enforcement later). Because of this near-NIL utilization of CPU and memory, it is possible to specify zero for one of the dimensions (CPU or memory) when making a request.
+
+The current limitation is that the increment is also the minimum. 
+
+If we set the memory increment to 1MB, then when making a pure CPU request we would have to specify 1MB of memory. That would work. However, it would allow discretionary memory requests without the desired normalization (increments of 256, 512, etc).
+
+If we set the CPU increment to 1 CPU, then when making a pure memory request we would have to specify 1 CPU. CPU amounts are much smaller than memory amounts, and because we don't have fractional CPUs, all my pure memory requests would waste 1 CPU, thus reducing the overall utilization of the cluster.
+
+Finally, on hard enforcement. 
+
+* For CPU. Hard enforcement can be done via a cgroup cpu controller. Using an absolute minimum of a few CPU shares (i.e. 10) in the LinuxContainerExecutor, we ensure there are enough CPU cycles to run the sleep process. This absolute minimum would only kick in if zero is allowed; otherwise it will never kick in, as the shares for 1 CPU are 1024.
+
+* For Memory. Hard enforcement is currently done by ProcfsBasedProcessTree.java; using an absolute minimum of 1 or 2 MBs would take care of zero memory resources. Again, this absolute minimum would only kick in if zero is allowed; otherwise it will never kick in, as the memory increment is several MBs if not 1GB.</blockquote></li>
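The minimum/increment behavior described above can be sketched as follows (an assumed helper, roughly what a scheduler's normalization step does when the minimum is decoupled from the increment and may be zero; not the actual fair scheduler code):

```java
public class ZeroCapableNormalizer {
    // Round a requested amount up to the nearest multiple of the increment,
    // clamped below by the minimum. With minimum = 0, a pure-CPU request can
    // legitimately carry 0 MB of memory (hypothetical helper).
    static int normalize(int requested, int minimum, int increment) {
        if (requested <= minimum) {
            return minimum;
        }
        // Integer ceiling division, then scale back up.
        return ((requested + increment - 1) / increment) * increment;
    }

    public static void main(String[] args) {
        System.out.println(normalize(0, 0, 256));   // 0: zero-memory request allowed
        System.out.println(normalize(300, 0, 256)); // 512: still normalized upward
    }
}
```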
+<li> <a href="https://issues.apache.org/jira/browse/YARN-787">YARN-787</a>.
+     Blocker sub-task reported by Alejandro Abdelnur and fixed by Alejandro Abdelnur (api)<br>
+     <b>Remove resource min from Yarn client API</b><br>
+     <blockquote>Per discussions in YARN-689 and YARN-769 we should remove minimum from the API as this is a scheduler internal thing.
+</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-782">YARN-782</a>.
+     Critical improvement reported by Sandy Ryza and fixed by Sandy Ryza (nodemanager)<br>
+     <b>vcores-pcores ratio functions differently from vmem-pmem ratio in misleading way </b><br>
+     <blockquote>The vcores-pcores ratio functions differently from the vmem-pmem ratio in the sense that the vcores-pcores ratio has an impact on allocations and the vmem-pmem ratio does not.
+
+If I double my vmem-pmem ratio, the only change that occurs is that my containers, after being scheduled, are less likely to be killed for using too much virtual memory.  But if I double my vcore-pcore ratio, my nodes will appear to the ResourceManager to contain double the amount of CPU space, which will affect scheduling decisions.
+
+The lack of consistency will exacerbate the already difficult problem of resource configuration.
+</blockquote></li>
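The asymmetry between the two ratios can be restated with a small arithmetic sketch (helper names are assumptions, a hedged illustration rather than actual YARN code):

```java
public class RatioSketch {
    // The vcores-pcores ratio changes what the RM schedules against:
    // doubling it doubles the node's apparent CPU capacity.
    static int advertisedVCores(int physicalCores, int vcoresPcoresRatio) {
        return physicalCores * vcoresPcoresRatio;
    }

    // The vmem-pmem ratio only changes the post-scheduling kill threshold;
    // it has no effect on allocation decisions.
    static long vmemLimitBytes(long pmemBytes, double vmemPmemRatio) {
        return (long) (pmemBytes * vmemPmemRatio);
    }

    public static void main(String[] args) {
        System.out.println(advertisedVCores(8, 1)); // 8
        System.out.println(advertisedVCores(8, 2)); // 16: scheduling capacity doubles
        System.out.println(vmemLimitBytes(1024L * 1024 * 1024, 2.1));
    }
}
```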
+<li> <a href="https://issues.apache.org/jira/browse/YARN-781">YARN-781</a>.
+     Major sub-task reported by Devaraj Das and fixed by Jian He <br>
+     <b>Expose LOGDIR that containers should use for logging</b><br>
+     <blockquote>The LOGDIR is known. We should expose this to the container's environment.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-777">YARN-777</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Remove unreferenced objects from proto</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-773">YARN-773</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Move YarnRuntimeException from package api.yarn to api.yarn.exceptions</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-767">YARN-767</a>.
+     Major bug reported by Jian He and fixed by Jian He <br>
+     <b>Initialize Application status metrics  when QueueMetrics is initialized</b><br>
+     <blockquote>Applications: ResourceManager.QueueMetrics.AppsSubmitted, ResourceManager.QueueMetrics.AppsRunning, ResourceManager.QueueMetrics.AppsPending, ResourceManager.QueueMetrics.AppsCompleted, ResourceManager.QueueMetrics.AppsKilled, ResourceManager.QueueMetrics.AppsFailed
+Currently these metrics are created only when they are needed; we want them to be visible as soon as QueueMetrics is initialized.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-764">YARN-764</a>.
+     Major bug reported by nemon lou and fixed by nemon lou (resourcemanager)<br>
+     <b>blank Used Resources on Capacity Scheduler page </b><br>
+     <blockquote>Even when there are jobs running, Used Resources is empty on the Capacity Scheduler page for the leaf queue. (I use google-chrome on windows 7.)
+After changing resource.java's toString method by replacing "&lt;&gt;" with "{}", this bug gets fixed.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-761">YARN-761</a>.
+     Major bug reported by Vinod Kumar Vavilapalli and fixed by Zhijie Shen <br>
+     <b>TestNMClientAsync fails sometimes</b><br>
+     <blockquote>See https://builds.apache.org/job/PreCommit-YARN-Build/1101//testReport/.
+
+It passed on my machine though.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-760">YARN-760</a>.
+     Major bug reported by Sandy Ryza and fixed by Niranjan Singh (nodemanager)<br>
+     <b>NodeManager throws AvroRuntimeException on failed start</b><br>
+     <blockquote>NodeManager wraps exceptions that occur in its start method in AvroRuntimeExceptions, even though it doesn't use Avro anywhere else.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-759">YARN-759</a>.
+     Major sub-task reported by Bikas Saha and fixed by Bikas Saha <br>
+     <b>Create Command enum in AllocateResponse</b><br>
+     <blockquote>Use command enums for shutdown/resync instead of booleans.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-757">YARN-757</a>.
+     Blocker bug reported by Bikas Saha and fixed by Bikas Saha <br>
+     <b>TestRMRestart failing/stuck on trunk</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-756">YARN-756</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Move PreemptionContainer/PremptionContract/PreemptionMessage/StrictPreemptionContract/PreemptionResourceRequest to api.records</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-755">YARN-755</a>.
+     Major sub-task reported by Bikas Saha and fixed by Bikas Saha <br>
+     <b>Rename AllocateResponse.reboot to AllocateResponse.resync</b><br>
+     <blockquote>For work-preserving RM restart, AMs will resync instead of rebooting. Rebooting is an action that currently satisfies the resync requirement; renaming now means the name will still make sense in the real resync case.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-753">YARN-753</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Add individual factory method for api protocol records</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-752">YARN-752</a>.
+     Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (api , applications)<br>
+     <b>In AMRMClient, automatically add corresponding rack requests for requested nodes</b><br>
+     <blockquote>A ContainerRequest that includes node-level requests must also include matching rack-level requests for the racks that those nodes are on.  When a node is present without its rack, it makes sense for the client to automatically add the node's rack.</blockquote></li>
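The rule can be sketched as a small helper: given the requested nodes and a node-to-rack resolver, compute the racks whose rack-level requests must accompany them. All names here are hypothetical, not the actual AMRMClient API:

```java
import java.util.*;

// Hypothetical helper: derive the rack-level requests that must
// accompany a set of node-level requests.
public class RackAugmenter {
    // Assumed node -> rack mapping; real clusters resolve this via
    // the configured topology.
    private final Map<String, String> nodeToRack;

    public RackAugmenter(Map<String, String> nodeToRack) {
        this.nodeToRack = nodeToRack;
    }

    // Racks covering the requested nodes; the client would add a
    // matching rack-level request for each one.
    public Set<String> racksFor(Collection<String> nodes) {
        Set<String> racks = new TreeSet<>();
        for (String node : nodes) {
            String rack = nodeToRack.get(node);
            if (rack != null) {
                racks.add(rack);
            }
        }
        return racks;
    }
}
```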
+<li> <a href="https://issues.apache.org/jira/browse/YARN-750">YARN-750</a>.
+     Major sub-task reported by Arun C Murthy and fixed by Arun C Murthy <br>
+     <b>Allow for black-listing resources in YARN API and Impl in CS</b><br>
+     <blockquote>YARN-392 and YARN-398 enhance scheduler api to allow for white-lists of resources.
+
+This jira is a companion to allow for black-listing (in CS).</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-749">YARN-749</a>.
+     Major sub-task reported by Arun C Murthy and fixed by Arun C Murthy <br>
+     <b>Rename ResourceRequest (get,set)HostName to (get,set)ResourceName</b><br>
+     <blockquote>We should rename ResourceRequest (get,set)HostName to (get,set)ResourceName since the name can be host, rack or *.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-748">YARN-748</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Move BuilderUtils from yarn-common to yarn-server-common</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-746">YARN-746</a>.
+     Major sub-task reported by Steve Loughran and fixed by Steve Loughran <br>
+     <b>rename Service.register() and Service.unregister() to registerServiceListener() &amp; unregisterServiceListener() respectively</b><br>
+     <blockquote>Make it clear what is being registered on a {{Service}} by naming the methods {{registerServiceListener()}} and {{unregisterServiceListener()}} respectively.
+
+This only affects a couple of production classes; {{Service.register()}} is also used in some of the lifecycle tests from YARN-530. There are no tests of {{Service.unregister()}}, which is something that could be corrected.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-742">YARN-742</a>.
+     Major bug reported by Kihwal Lee and fixed by Jason Lowe (nodemanager)<br>
+     <b>Log aggregation causes a lot of redundant setPermission calls</b><br>
+     <blockquote>In one of our clusters, NameNode RPC is spending 45% of its time serving setPermission calls. Further investigation revealed that most calls are redundantly made on /mapred/logs/&lt;user&gt;/logs; mkdirs calls are also made beforehand.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-739">YARN-739</a>.
+     Major sub-task reported by Siddharth Seth and fixed by Omkar Vinit Joshi <br>
+     <b>NM startContainer should validate the NodeId</b><br>
+     <blockquote>The NM validates certain fields from the ContainerToken on a startContainer call. It should also validate the NodeId (which needs to be added to the ContainerToken).</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-737">YARN-737</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Some Exceptions no longer need to be wrapped by YarnException and can be directly thrown out after YARN-142 </b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-735">YARN-735</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Make ApplicationAttemptID, ContainerID, NodeID immutable</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-733">YARN-733</a>.
+     Major bug reported by Zhijie Shen and fixed by Zhijie Shen <br>
+     <b>TestNMClient fails occasionally</b><br>
+     <blockquote>The problem happens at:
+{code}
+        // getContainerStatus can be called after stopContainer
+        try {
+          ContainerStatus status = nmClient.getContainerStatus(
+              container.getId(), container.getNodeId(),
+              container.getContainerToken());
+          assertEquals(container.getId(), status.getContainerId());
+          assertEquals(ContainerState.RUNNING, status.getState());
+          assertTrue("" + i, status.getDiagnostics().contains(
+              "Container killed by the ApplicationMaster."));
+          assertEquals(-1000, status.getExitStatus());
+        } catch (YarnRemoteException e) {
+          fail("Exception is not expected");
+        }
+{code}
+
+NMClientImpl#stopContainer returns, but the container hasn't been stopped immediately: ContainerManagerImpl implements stopContainer in an asynchronous style, so the container's status is in transition. Calling NMClientImpl#getContainerStatus immediately after stopContainer may return either the RUNNING status or the COMPLETE one.
+
+There is a similar problem with NMClientImpl#startContainer.
+</blockquote></li>
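Because the stop is asynchronous on the NM side, a test (or client) that needs the final state has to poll until the container leaves RUNNING instead of reading the status once. A minimal sketch with a hypothetical status source, not the real NMClient API:

```java
// Hypothetical sketch: poll a container's state until it leaves RUNNING,
// since stopContainer only *initiates* the stop.
public class StatusPoller {
    enum State { RUNNING, COMPLETE }

    interface StatusSource {
        State currentState();
    }

    // Poll up to maxAttempts times and return the last observed state.
    // Real code would sleep briefly between attempts.
    static State waitForCompletion(StatusSource source, int maxAttempts) {
        State state = source.currentState();
        for (int i = 1; i < maxAttempts && state == State.RUNNING; i++) {
            state = source.currentState();
        }
        return state;
    }
}
```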
+<li> <a href="https://issues.apache.org/jira/browse/YARN-731">YARN-731</a>.
+     Major sub-task reported by Siddharth Seth and fixed by Zhijie Shen <br>
+     <b>RPCUtil.unwrapAndThrowException should unwrap remote RuntimeExceptions</b><br>
+     <blockquote>Will be required for YARN-662. Also, remote NPEs show up incorrectly for some unit tests.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-726">YARN-726</a>.
+     Critical bug reported by Siddharth Seth and fixed by Mayank Bansal <br>
+     <b>Queue, FinishTime fields broken on RM UI</b><br>
+     <blockquote>The queue shows up as "Invalid Date"
+Finish Time shows up as a Long value.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-724">YARN-724</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Move ProtoBase from api.records to api.records.impl.pb</b><br>
+     <blockquote>Simply move ProtoBase to records.impl.pb</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-720">YARN-720</a>.
+     Major sub-task reported by Siddharth Seth and fixed by Zhijie Shen <br>
+     <b>container-log4j.properties should not refer to mapreduce properties</b><br>
+     <blockquote>This refers to yarn.app.mapreduce.container.log.dir and yarn.app.mapreduce.container.log.filesize. These properties should either be moved into the MR codebase or be renamed.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-719">YARN-719</a>.
+     Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli <br>
+     <b>Move RMIdentifier from Container to ContainerTokenIdentifier</b><br>
+     <blockquote>This needs to be done for YARN-684 to happen.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-717">YARN-717</a>.
+     Major sub-task reported by Jian He and fixed by Jian He <br>
+     <b>Copy BuilderUtil methods into token-related records</b><br>
+     <blockquote>This is separated from YARN-711 because, after changing yarn.api.token from an interface to an abstract class, e.g. ClientTokenPBImpl would have to extend two classes (both TokenPBImpl and the ClientToken abstract class), which is not allowed in Java.
+
+We may remove the ClientToken/ContainerToken/DelegationToken interfaces and just use the common Token interface.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-716">YARN-716</a>.
+     Major task reported by Siddharth Seth and fixed by Siddharth Seth <br>
+     <b>Make ApplicationID immutable</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-715">YARN-715</a>.
+     Major bug reported by Siddharth Seth and fixed by Vinod Kumar Vavilapalli <br>
+     <b>TestDistributedShell and TestUnmanagedAMLauncher are failing</b><br>
+     <blockquote>Tests are timing out. Looks like this is related to YARN-617.
+{code}
+2013-05-21 17:40:23,693 ERROR [IPC Server handler 0 on 54024] containermanager.ContainerManagerImpl (ContainerManagerImpl.java:authorizeRequest(412)) - Unauthorized request to start container.
+Expected containerId: user Found: container_1369183214008_0001_01_000001
+2013-05-21 17:40:23,694 ERROR [IPC Server handler 0 on 54024] security.UserGroupInformation (UserGroupInformation.java:doAs(1492)) - PriviledgedActionException as:user (auth:SIMPLE) cause:org.apache.hado
+Expected containerId: user Found: container_1369183214008_0001_01_000001
+2013-05-21 17:40:23,695 INFO  [IPC Server handler 0 on 54024] ipc.Server (Server.java:run(1864)) - IPC Server handler 0 on 54024, call org.apache.hadoop.yarn.api.ContainerManagerPB.startContainer from 10.
+Expected containerId: user Found: container_1369183214008_0001_01_000001
+org.apache.hadoop.yarn.exceptions.YarnRemoteException: Unauthorized request to start container.
+Expected containerId: user Found: container_1369183214008_0001_01_000001
+  at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:43)
+  at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.authorizeRequest(ContainerManagerImpl.java:413)
+  at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainer(ContainerManagerImpl.java:440)
+  at org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagerPBServiceImpl.startContainer(ContainerManagerPBServiceImpl.java:72)
+  at org.apache.hadoop.yarn.proto.ContainerManager$ContainerManagerService$2.callBlockingMethod(ContainerManager.java:83)
+  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:527)
+{code}</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-714">YARN-714</a>.
+     Major sub-task reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi <br>
+     <b>AMRM protocol changes for sending NMToken list</b><br>
+     <blockquote>NMToken will be sent to AM on allocate call if
+1) AM doesn't already have NMToken for the underlying NM
+2) Key rolled over on RM and AM gets new container on the same NM.
+On allocate call RM will send a consolidated list of all required NMTokens.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-711">YARN-711</a>.
+     Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Jian He <br>
+     <b>Copy BuilderUtil methods into individual records</b><br>
+     <blockquote>BuilderUtils is one giant utils class that has all the factory methods needed for creating records. It is painful for users to figure out how to create records; we are better off having the factories in each record, so that users can easily create them.
+
+As a first step, we should just copy all the factory methods into individual classes, deprecate BuilderUtils and then slowly move all code off BuilderUtils.</blockquote></li>
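The "factory per record" pattern described above can be sketched as follows; the record name and fields are hypothetical, chosen only to illustrate the shape:

```java
// Hypothetical record illustrating the pattern: a static newInstance()
// factory lives on the record itself rather than in a central
// BuilderUtils class.
public class ContainerRequest {
    private final String resourceName;
    private final int memory;

    private ContainerRequest(String resourceName, int memory) {
        this.resourceName = resourceName;
        this.memory = memory;
    }

    // Callers discover the factory on the record, not in a utils class.
    public static ContainerRequest newInstance(String resourceName, int memory) {
        return new ContainerRequest(resourceName, memory);
    }

    public String getResourceName() { return resourceName; }
    public int getMemory() { return memory; }
}
```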
+<li> <a href="https://issues.apache.org/jira/browse/YARN-708">YARN-708</a>.
+     Major task reported by Siddharth Seth and fixed by Siddharth Seth <br>
+     <b>Move RecordFactory classes to hadoop-yarn-api, miscellaneous fixes to the interfaces</b><br>
+     <blockquote>This is required for additional changes in YARN-528. 
+Some of the interfaces could use some cleanup as well - they shouldn't be declaring YarnException (Runtime) in their signature.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-706">YARN-706</a>.
+     Major bug reported by Zhijie Shen and fixed by Zhijie Shen <br>
+     <b>Race Condition in TestFSDownload</b><br>
+     <blockquote>See the test failure in YARN-695
+
+https://builds.apache.org/job/PreCommit-YARN-Build/957//testReport/org.apache.hadoop.yarn.util/TestFSDownload/testDownloadPatternJar/</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-700">YARN-700</a>.
+     Major bug reported by Ivan Mitic and fixed by Ivan Mitic <br>
+     <b>TestInfoBlock fails on Windows because of line ending missmatch</b><br>
+     <blockquote>Exception:
+{noformat}
+Running org.apache.hadoop.yarn.webapp.view.TestInfoBlock
+Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.962 sec &lt;&lt;&lt; FAILURE!
+testMultilineInfoBlock(org.apache.hadoop.yarn.webapp.view.TestInfoBlock)  Time elapsed: 873 sec  &lt;&lt;&lt; FAILURE!
+java.lang.AssertionError: 
+	at org.junit.Assert.fail(Assert.java:91)
+	at org.junit.Assert.assertTrue(Assert.java:43)
+	at org.junit.Assert.assertTrue(Assert.java:54)
+	at org.apache.hadoop.yarn.webapp.view.TestInfoBlock.testMultilineInfoBlock(TestInfoBlock.java:79)
+	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
+	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
+	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
+	at java.lang.reflect.Method.invoke(Method.java:597)
+	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
+	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
+	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
+	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
+	at org.junit.internal.runners.statements.FailOnTimeout$1.run(FailOnTimeout.java:28)
+{noformat}</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-695">YARN-695</a>.
+     Major sub-task reported by Zhijie Shen and fixed by Zhijie Shen <br>
+     <b>masterContainer and status are in ApplicationReportProto but not in ApplicationReport</b><br>
+     <blockquote>If masterContainer and status are no longer part of ApplicationReport, they should be removed from proto as well.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-694">YARN-694</a>.
+     Major bug reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi <br>
+     <b>Start using NMTokens to authenticate all communication with NM</b><br>
+     <blockquote>The AM uses the NMToken to authenticate all AM-NM communication.
+The NM will validate an NMToken in the following manner:
+* If the NMToken uses the current or previous master key, it is valid. In this case the NM will update its cache with this key for the corresponding appId.
+* If the NMToken uses the master key present in the NM's cache for the AM's appId, it will be validated against that.
+* If the NMToken is invalid, the NM will reject the AM's calls.
+
+Modifications for ContainerToken:
+* At present, RPC validates AM-NM communication based on the ContainerToken. It will be replaced with the NMToken. From now on the AM will use one NMToken per NM (replacing the earlier behavior of one ContainerToken per container per NM).
+* In a secure environment, startContainer currently takes the ContainerToken from the UGI (YARN-617); after this change it will take it from the payload (Container).
+* The ContainerToken will still exist, but will only be used to validate the AM's container-start request.</blockquote></li>
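The validation rules above can be sketched as follows; the integer key ids, string appIds, and per-app cache are simplified stand-ins, not the real NM implementation:

```java
import java.util.*;

// Hypothetical sketch of the NMToken validation rules: accept tokens
// signed with the current or previous master key (remembering the key
// per application), or with the key already cached for that application.
public class NMTokenValidator {
    private int currentKeyId;
    private int previousKeyId;
    private final Map<String, Integer> keyPerApp = new HashMap<>();

    public NMTokenValidator(int currentKeyId, int previousKeyId) {
        this.currentKeyId = currentKeyId;
        this.previousKeyId = previousKeyId;
    }

    public boolean validate(String appId, int tokenKeyId) {
        if (tokenKeyId == currentKeyId || tokenKeyId == previousKeyId) {
            // Rule 1: current/previous master key -> update per-app cache.
            keyPerApp.put(appId, tokenKeyId);
            return true;
        }
        // Rule 2: a key previously cached for this application is valid.
        Integer cached = keyPerApp.get(appId);
        if (cached != null && cached == tokenKeyId) {
            return true;
        }
        // Rule 3: anything else is rejected.
        return false;
    }

    // Master key roll-over on the RM side.
    public void rollKey(int newKeyId) {
        previousKeyId = currentKeyId;
        currentKeyId = newKeyId;
    }
}
```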
+<li> <a href="https://issues.apache.org/jira/browse/YARN-693">YARN-693</a>.
+     Major bug reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi <br>
+     <b>Sending NMToken to AM on allocate call</b><br>
+     <blockquote>This is part of YARN-613.
+As per the updated design, the AM will receive one NMToken per NM in the following scenarios:
+* The AM receives its first container on the underlying NM.
+* The AM receives a container on the underlying NM after either the NM or the RM restarts.
+** After an RM restart, as the RM doesn't remember (persist) the information about keys issued per AM per NM, it will reissue tokens when the AM gets a new container on the underlying NM. However, the NM will still retain the older token until it receives a new one, to support long-running jobs (in a work-preserving environment).
+** After an NM restart, the RM will delete the token information corresponding to that NM for all AMs.
+* The AM receives a container on the underlying NM after the NMToken master key is rolled over on the RM side.
+In all cases, if the AM receives a new NMToken it is supposed to store it for future NM communication until it receives a newer one.
+
+AMRMClient should expose these NMTokens to the client.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-692">YARN-692</a>.
+     Major bug reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi <br>
+     <b>Creating NMToken master key on RM and sharing it with NM as a part of RM-NM heartbeat.</b><br>
+     <blockquote>This is related to YARN-613. Here we will implement NMToken generation on the RM side and share it with the NM during the RM-NM heartbeat. As part of this JIRA, the master key will only be made available to the NM; no validation will be done until AM-NM communication is fixed.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-690">YARN-690</a>.
+     Blocker bug reported by Daryn Sharp and fixed by Daryn Sharp (resourcemanager)<br>
+     <b>RM exits on token cancel/renew problems</b><br>
+     <blockquote>The DelegationTokenRenewer thread is critical to the RM.  When a non-IOException occurs, the thread calls System.exit to prevent the RM from running without the thread.  It should be exiting only on non-RuntimeExceptions.
+
+The problem is especially bad in 23 because the yarn protobuf layer converts IOExceptions into UndeclaredThrowableExceptions (RuntimeException) which causes the renewer to abort the process.  An UnknownHostException takes down the RM...</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-686">YARN-686</a>.
+     Major sub-task reported by Sandy Ryza and fixed by Sandy Ryza (api)<br>
+     <b>Flatten NodeReport</b><br>
+     <blockquote>The NodeReport returned by getClusterNodes or given to AMs in heartbeat responses includes both a NodeState (enum) and a NodeHealthStatus (object).  As UNHEALTHY is already NodeState, a separate NodeHealthStatus doesn't seem necessary.  I propose eliminating NodeHealthStatus#getIsNodeHealthy and moving its two other methods, getHealthReport and getLastHealthReportTime, into NodeReport.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-684">YARN-684</a>.
+     Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli <br>
+     <b>ContainerManager.startContainer needs to only have ContainerTokenIdentifier instead of the whole Container</b><br>
+     <blockquote>The NM only needs the token, the whole Container is unnecessary.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-663">YARN-663</a>.
+     Major sub-task reported by Xuan Gong and fixed by Xuan Gong <br>
+     <b>Change ResourceTracker API and LocalizationProtocol API to throw YarnRemoteException and IOException</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-660">YARN-660</a>.
+     Major sub-task reported by Bikas Saha and fixed by Bikas Saha <br>
+     <b>Improve AMRMClient with matching requests</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-655">YARN-655</a>.
+     Major bug reported by Sandy Ryza and fixed by Sandy Ryza (scheduler)<br>
+     <b>Fair scheduler metrics should subtract allocated memory from available memory</b><br>
+     <blockquote>In the scheduler web UI, cluster metrics reports that the "Memory Total" goes up when an application is allocated resources.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-651">YARN-651</a>.
+     Major sub-task reported by Xuan Gong and fixed by Xuan Gong <br>
+     <b>Change ContainerManagerPBClientImpl and RMAdminProtocolPBClientImpl to throw IOException and YarnRemoteException</b><br>
+     <blockquote>YARN-632 and YARN-633 change the RMAdmin and ContainerManager APIs to throw YarnRemoteException and IOException. RMAdminProtocolPBClientImpl and ContainerManagerPBClientImpl should make the same changes.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-648">YARN-648</a>.
+     Major bug reported by Karthik Kambatla and fixed by Karthik Kambatla (scheduler)<br>
+     <b>FS: Add documentation for pluggable policy</b><br>
+     <blockquote>YARN-469 and YARN-482 make the scheduling policy in FS pluggable. Need to add documentation on how to use this.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-646">YARN-646</a>.
+     Major bug reported by Dapeng Sun and fixed by Dapeng Sun (documentation)<br>
+     <b>Some issues in Fair Scheduler's document</b><br>
+     <blockquote>Issues were found in the doc page for the Fair Scheduler, http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html:
+1. In the section &#8220;Configuration&#8221;, it lists two properties named &#8220;yarn.scheduler.fair.minimum-allocation-mb&#8221;; the second one should be &#8220;yarn.scheduler.fair.maximum-allocation-mb&#8221;.
+2. In the section &#8220;Allocation file format&#8221;, the document says &#8220;The format contains three types of elements&#8221;, but it then lists four types of elements.</blockquote></li>
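The first issue above concerns a property-name typo; with the name corrected, the pair of properties would read as below (a hedged sketch of the intended configuration; the values are illustrative only):

```xml
<!-- Corrected property names as the document intends; values illustrative. -->
<property>
  <name>yarn.scheduler.fair.minimum-allocation-mb</name>
  <value>1024</value>
</property>
<property>
  <name>yarn.scheduler.fair.maximum-allocation-mb</name>
  <value>8192</value>
</property>
```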
+<li> <a href="https://issues.apache.org/jira/browse/YARN-645">YARN-645</a>.
+     Major bug reported by Jian He and fixed by Jian He <br>
+     <b>Move RMDelegationTokenSecretManager from yarn-server-common to yarn-server-resourcemanager</b><br>
+     <blockquote>RMDelegationTokenSecretManager is specific to resource manager, should not belong to server-common</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-642">YARN-642</a>.
+     Major bug reported by Sandy Ryza and fixed by Sandy Ryza (api , resourcemanager)<br>
+     <b>Fix up /nodes REST API to have 1 param and be consistent with the Java API</b><br>
+     <blockquote>The code behind the /nodes RM REST API is unnecessarily muddled, logs the same misspelled INFO message repeatedly, and does not return unhealthy nodes, even when asked.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-639">YARN-639</a>.
+     Major bug reported by Zhijie Shen and fixed by Zhijie Shen (applications/distributed-shell)<br>
+     <b>Make AM of Distributed Shell Use NMClient</b><br>
+     <blockquote>YARN-422 adds NMClient. AM of Distributed Shell should use it instead of using ContainerManager directly.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-638">YARN-638</a>.
+     Major sub-task reported by Jian He and fixed by Jian He (resourcemanager)<br>
+     <b>Restore RMDelegationTokens after RM Restart</b><br>
+     <blockquote>This is missed in YARN-581. After RM restart, RMDelegationTokens need to be added both in DelegationTokenRenewer (addressed in YARN-581), and delegationTokenSecretManager</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-637">YARN-637</a>.
+     Major bug reported by Karthik Kambatla and fixed by Karthik Kambatla (scheduler)<br>
+     <b>FS: maxAssign is not honored</b><br>
+     <blockquote>maxAssign limits the number of containers that can be assigned in a single heartbeat. Currently, FS doesn't keep track of the number of assigned containers in order to enforce this.</blockquote></li>
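Honoring maxAssign amounts to counting assignments within a single heartbeat; a minimal sketch (not the actual FairScheduler code), where a non-positive maxAssign means unlimited:

```java
// Hypothetical sketch: cap the number of containers assigned in one
// node heartbeat at maxAssign (a value <= 0 meaning "unlimited").
public class HeartbeatAssigner {
    private final int maxAssign;

    public HeartbeatAssigner(int maxAssign) {
        this.maxAssign = maxAssign;
    }

    // Returns how many containers get assigned this heartbeat, given how
    // many assignments the scheduler could otherwise make.
    public int assign(int possibleAssignments) {
        int assigned = 0;
        while (assigned < possibleAssignments
            && (maxAssign <= 0 || assigned < maxAssign)) {
            assigned++; // real code would allocate a container here
        }
        return assigned;
    }
}
```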
+<li> <a href="https://issues.apache.org/jira/browse/YARN-635">YARN-635</a>.
+     Major sub-task reported by Xuan Gong and fixed by Siddharth Seth <br>
+     <b>Rename YarnRemoteException to YarnException</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-634">YARN-634</a>.
+     Major sub-task reported by Siddharth Seth and fixed by Siddharth Seth <br>
+     <b>Make YarnRemoteException not backed by PB and introduce a SerializedException</b><br>
+     <blockquote>LocalizationProtocol sends an exception over the wire. This currently uses YarnRemoteException. Post YARN-627, this needs to be changed and a new serialized exception is required.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-633">YARN-633</a>.
+     Major sub-task reported by Xuan Gong and fixed by Xuan Gong <br>
+     <b>Change RMAdminProtocol api to throw IOException and YarnRemoteException</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-632">YARN-632</a>.
+     Major sub-task reported by Xuan Gong and fixed by Xuan Gong <br>
+     <b>Change ContainerManager api to throw IOException and YarnRemoteException</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-631">YARN-631</a>.
+     Major sub-task reported by Xuan Gong and fixed by Xuan Gong <br>
+     <b>Change ClientRMProtocol api to throw IOException and YarnRemoteException</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-630">YARN-630</a>.
+     Major sub-task reported by Xuan Gong and fixed by Xuan Gong <br>
+     <b>Change AMRMProtocol api to throw IOException and YarnRemoteException</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-629">YARN-629</a>.
+     Major sub-task reported by Xuan Gong and fixed by Xuan Gong <br>
+     <b>Make YarnRemoteException not be rooted at IOException</b><br>
+     <blockquote>After HADOOP-9343, it should be possible for YarnException to not be rooted at IOException</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-628">YARN-628</a>.
+     Major sub-task reported by Siddharth Seth and fixed by Siddharth Seth <br>
+     <b>Fix YarnException unwrapping</b><br>
+     <blockquote>Unwrapping of YarnRemoteExceptions (currently in YarnRemoteExceptionPBImpl, in RPCUtil post YARN-625) is broken, and often ends up throwing UndeclaredThrowableException. This needs to be fixed.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-625">YARN-625</a>.
+     Major sub-task reported by Siddharth Seth and fixed by Siddharth Seth <br>
+     <b>Move unwrapAndThrowException from YarnRemoteExceptionPBImpl to RPCUtil</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-618">YARN-618</a>.
+     Major bug reported by Jian He and fixed by Jian He <br>
+     <b>Modify RM_INVALID_IDENTIFIER to a negative number</b><br>
+     <blockquote>RM_INVALID_IDENTIFIER set to 0 doesn't sound right, as many tests set it to 0. A negative number is probably what we want.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-617">YARN-617</a>.
+     Minor sub-task reported by Vinod Kumar Vavilapalli and fixed by Omkar Vinit Joshi <br>
+     <b>In unsecure mode, AM can fake resource requirements</b><br>
+     <blockquote>Without security, it is impossible to completely avoid AMs faking resources. We can at least make it as difficult as possible by using the same container tokens and the RM-NM shared-key mechanism over the unauthenticated RM-NM channel.
+
+At a minimum, this will avoid accidental bugs in AMs in unsecure mode.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-615">YARN-615</a>.
+     Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli <br>
+     <b>ContainerLaunchContext.containerTokens should simply be called tokens</b><br>
+     <blockquote>ContainerToken is the name of the specific token that AMs use to launch containers on NMs, so we should rename CLC.containerTokens to be simply tokens.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-613">YARN-613</a>.
+     Major sub-task reported by Bikas Saha and fixed by Omkar Vinit Joshi <br>
+     <b>Create NM proxy per NM instead of per container</b><br>
+     <blockquote>Currently a new NM proxy has to be created per container since the secure authentication is using a containertoken from the container.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-610">YARN-610</a>.
+     Blocker sub-task reported by Siddharth Seth and fixed by Omkar Vinit Joshi <br>
+     <b>ClientToken (ClientToAMToken) should not be set in the environment</b><br>
+     <blockquote>Similar to YARN-579, this can be set via ContainerTokens</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-605">YARN-605</a>.
+     Major bug reported by Hitesh Shah and fixed by Hitesh Shah <br>
+     <b>Failing unit test in TestNMWebServices when using git for source control </b><br>
+     <blockquote>Failed tests:   testNode(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices): hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789
+The remaining tests (testNodeSlash, testNodeDefault, testNodeInfo, testNodeInfoSlash, testNodeInfoDefault, testSingleNodesXML) fail with the identical hadoopBuildVersion mismatch.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-600">YARN-600</a>.
+     Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (resourcemanager , scheduler)<br>
+     <b>Hook up cgroups CPU settings to the number of virtual cores allocated</b><br>
+     <blockquote>YARN-3 introduced CPU isolation and monitoring through cgroups.  YARN-2 introduced CPU scheduling in the capacity scheduler, and YARN-326 will introduce it in the fair scheduler.  The number of virtual cores allocated to a container should be used to weight the number of cgroups CPU shares given to it.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-599">YARN-599</a>.
+     Major bug reported by Zhijie Shen and fixed by Zhijie Shen <br>
+     <b>Refactoring submitApplication in ClientRMService and RMAppManager</b><br>
+     <blockquote>Currently, ClientRMService#submitApplication calls RMAppManager#handle, and consequently calls RMAppManager#submitApplication directly, though the code looks like it is scheduling an APP_SUBMIT event.
+
+In addition, the validation code before creating an RMApp instance is not well organized. Ideally, the dynamic validation, which depends on the RM's configuration, should be put in RMAppManager#submitApplication. RMAppManager#submitApplication is called by ClientRMService#submitApplication and RMAppManager#recover. Since the configuration may be changed after the RM restarts, the validation needs to be done again even in recovery mode. Therefore, resource request validation, which is based on min/max resource limits, should be moved from ClientRMService#submitApplication to RMAppManager#submitApplication. On the other hand, the static validation, which is independent of the RM's configuration, should be put in ClientRMService#submitApplication, because it only needs to be done once, during the first submission.
+
+Furthermore, the try-catch flow in RMAppManager#submitApplication has a flaw: the method is not synchronized. If two application submissions with the same application ID enter the function, and one progresses to the completion of RMApp instantiation while the other progresses to the completion of putting the RMApp instance into rmContext, the slower submission will cause an exception due to the duplicate application ID. However, the exception will cause the RMApp instance already in rmContext (belonging to the faster submission) to be rejected with the current code flow.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-598">YARN-598</a>.
+     Major improvement reported by Sandy Ryza and fixed by Sandy Ryza (resourcemanager , scheduler)<br>
+     <b>Add virtual cores to queue metrics</b><br>
+     <blockquote>QueueMetrics includes allocatedMB, availableMB, pendingMB, reservedMB.  It should have equivalents for CPU.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-597">YARN-597</a>.
+     Major bug reported by Ivan Mitic and fixed by Ivan Mitic <br>
+     <b>TestFSDownload fails on Windows because of dependencies on tar/gzip/jar tools</b><br>
+     <blockquote>{{testDownloadArchive}}, {{testDownloadPatternJar}} and {{testDownloadArchiveZip}} fail with a similar Shell ExitCodeException:
+
+{code}
+testDownloadArchiveZip(org.apache.hadoop.yarn.util.TestFSDownload)  Time elapsed: 480 sec  &lt;&lt;&lt; ERROR!
+org.apache.hadoop.util.Shell$ExitCodeException: bash: line 0: cd: /D:/svn/t/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/target/TestFSDownload: No such file or directory
+gzip: 1: No such file or directory
+
+	at org.apache.hadoop.util.Shell.runCommand(Shell.java:377)
+	at org.apache.hadoop.util.Shell.run(Shell.java:292)
+	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:497)
+	at org.apache.hadoop.yarn.util.TestFSDownload.createZipFile(TestFSDownload.java:225)
+	at org.apache.hadoop.yarn.util.TestFSDownload.testDownloadArchiveZip(TestFSDownload.java:503)
+{code}</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-595">YARN-595</a>.
+     Major sub-task reported by Sandy Ryza and fixed by Sandy Ryza (scheduler)<br>
+     <b>Refactor fair scheduler to use common Resources</b><br>
+     <blockquote>resourcemanager.fair and resourcemanager.resources have two copies of basically the same code for operations on Resource objects</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-594">YARN-594</a>.
+     Major bug reported by Jian He and fixed by Jian He <br>
+     <b>Update test and add comments in YARN-534</b><br>
+     <blockquote>This jira is simply to add some comments in the patch YARN-534 and update the test case</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-593">YARN-593</a>.
+     Major bug reported by Chris Nauroth and fixed by Chris Nauroth (nodemanager)<br>
+     <b>container launch on Windows does not correctly populate classpath with new process's environment variables and localized resources</b><br>
+     <blockquote>On Windows, we must bundle the classpath of a launched container in an intermediate jar with a manifest.  Currently, this logic incorrectly uses the nodemanager process's environment variables for substitution.  Instead, it needs to use the new environment for the launched process.  Also, the bundled classpath is missing some localized resources for directories, due to a quirk in the way {{File#toURI}} decides whether or not to append a trailing '/'.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-591">YARN-591</a>.
+     Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli <br>
+     <b>RM recovery related records do not belong to the API</b><br>
+     <blockquote>We need to move ApplicationStateData and ApplicationAttemptStateData out into the resourcemanager module. They are not part of the public API.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-590">YARN-590</a>.
+     Major improvement reported by Vinod Kumar Vavilapalli and fixed by Mayank Bansal <br>
+     <b>Add an optional message to RegisterNodeManagerResponse as to why NM is being asked to resync or shutdown</b><br>
+     <blockquote>We should log such a message in the NM itself. This helps in debugging issues on the NM directly, instead of distributed debugging between the RM and NM, when such an action is received from the RM.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-586">YARN-586</a>.
+     Trivial bug reported by Zhijie Shen and fixed by Zhijie Shen <br>
+     <b>Typo in ApplicationSubmissionContext#setApplicationId</b><br>
+     <blockquote>The parameter should be applicationId instead of appplicationId</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-585">YARN-585</a>.
+     Major bug reported by Zhijie Shen and fixed by Zhijie Shen <br>
+     <b>TestFairScheduler#testNotAllowSubmitApplication is broken due to YARN-514</b><br>
+     <blockquote>TestFairScheduler#testNotAllowSubmitApplication is broken due to YARN-514. See the discussions in YARN-514.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-583">YARN-583</a>.
+     Major sub-task reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi <br>
+     <b>Application cache files should be localized under local-dir/usercache/userid/appcache/appid/filecache</b><br>
+     <blockquote>Currently, application cache files are localized under local-dir/usercache/userid/appcache/appid/. However, they should be localized under the filecache sub-directory.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-582">YARN-582</a>.
+     Major sub-task reported by Bikas Saha and fixed by Jian He (resourcemanager)<br>
+     <b>Restore appToken and clientToken for app attempt after RM restart</b><br>
+     <blockquote>These need to be saved and restored on a per app attempt basis. This is required only when work preserving restart is implemented for secure clusters. In non-preserving restart app attempts are killed and so this does not matter.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-581">YARN-581</a>.
+     Major sub-task reported by Bikas Saha and fixed by Jian He (resourcemanager)<br>
+     <b>Test and verify that app delegation tokens are added to tokenRenewer after RM restart</b><br>
+     <blockquote>The code already saves the delegation tokens in AppSubmissionContext. Upon restart the AppSubmissionContext is used to submit the application again and so restores the delegation tokens. This jira tracks testing and verifying this functionality in a secure setup.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-579">YARN-579</a>.
+     Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Vinod Kumar Vavilapalli <br>
+     <b>Make ApplicationToken part of Container's token list to help RM-restart</b><br>
+     <blockquote>The Container is already persisted to help RM restart. Instead of explicitly setting the ApplicationToken in the AM's env, if we change it to be in the Container, we can avoid the env variable and also help restart.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-578">YARN-578</a>.
+     Major sub-task reported by Vinod Kumar Vavilapalli and fixed by Omkar Vinit Joshi (nodemanager)<br>
+     <b>NodeManager should use SecureIOUtils for serving and aggregating logs</b><br>
+     <blockquote>Log servlets for serving logs and the ShuffleService for serving intermediate outputs both should use SecureIOUtils for avoiding symlink attacks.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-577">YARN-577</a>.
+     Major sub-task reported by Hitesh Shah and fixed by Hitesh Shah <br>
+     <b>ApplicationReport does not provide progress value of application</b><br>
+     <blockquote>An application sends its progress % to the RM via AllocateRequest. This should be able to be retrieved by a client via the ApplicationReport.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-576">YARN-576</a>.
+     Major bug reported by Hitesh Shah and fixed by Kenji Kikushima <br>
+     <b>RM should not allow registrations from NMs that do not satisfy minimum scheduler allocations</b><br>
+     <blockquote>If the minimum resource allocation configured for the RM scheduler is 1 GB, the RM should drop all NMs that register with a total capacity of less than 1 GB. </blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-571">YARN-571</a>.
+     Major sub-task reported by Hitesh Shah and fixed by Omkar Vinit Joshi <br>
+     <b>User should not be part of ContainerLaunchContext</b><br>
+     <blockquote>Today, a user is expected to set the user name in the CLC when either submitting an application or launching a container from the AM. This does not make sense as the user can/has been identified by the RM as part of the RPC layer.
+
+The solution would be to move the user information into either the Container object or directly into the ContainerToken, which can then be used by the NM to launch the container. This user information would be set into the container by the RM.
+
+</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-568">YARN-568</a>.
+     Major improvement reported by Carlo Curino and fixed by Carlo Curino (scheduler)<br>
+     <b>FairScheduler: support for work-preserving preemption </b><br>
+     <blockquote>In the attached patch, we modified the FairScheduler to substitute its preemption-by-killing with a work-preserving version of preemption (followed by killing if the AMs do not respond quickly enough). This should allow running preemption checks more often, but killing less often (proper tuning to be investigated).  Depends on YARN-567 and YARN-45; related to YARN-569.
+</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-567">YARN-567</a>.
+     Major sub-task reported by Carlo Curino and fixed by Carlo Curino (resourcemanager)<br>
+     <b>RM changes to support preemption for FairScheduler and CapacityScheduler</b><br>
+     <blockquote>A common tradeoff in scheduling jobs is between keeping the cluster busy and enforcing capacity/fairness properties. The FairScheduler and the CapacityScheduler take opposite stances on how to achieve this.
+
+The FairScheduler leverages task-killing to quickly reclaim resources from currently running jobs and redistribute them among new jobs, thus keeping the cluster busy but wasting useful work. The CapacityScheduler is typically tuned to limit the portion of the cluster used by each queue so that the likelihood of violating capacity is low, thus never wasting work, but risking keeping the cluster underutilized or having jobs wait to obtain their rightful capacity.
+
+By introducing the notion of work-preserving preemption we can remove this tradeoff.  This requires a protocol for preemption (YARN-45), and ApplicationMasters that can respond to preemption efficiently (e.g., by saving their intermediate state; this will be posted for MapReduce in a separate JIRA soon), together with a scheduler that can issue preemption requests (discussed in separate JIRAs YARN-568 and YARN-569).
+
+The changes we track with this JIRA are common to the FairScheduler and the CapacityScheduler, and are mostly propagation of preemption decisions through the ApplicationMastersService.
+</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-563">YARN-563</a>.
+     Major sub-task reported by Thomas Weise and fixed by Mayank Bansal <br>
+     <b>Add application type to ApplicationReport </b><br>
+     <blockquote>This field is needed to distinguish different types of applications (app master implementations). For example, we may run applications of type XYZ in a cluster alongside MR and would like to filter applications by type.
+</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-562">YARN-562</a>.
+     Major sub-task reported by Jian He and fixed by Jian He (resourcemanager)<br>
+     <b>NM should reject containers allocated by previous RM</b><br>
+     <blockquote>It's possible that after an RM shutdown, before the AM goes down, the AM may still call startContainer on the NM with containers allocated by the previous RM. When the RM comes back, the NM doesn't know whether the container launch request comes from the previous RM or the current RM. We should reject containers allocated by the previous RM.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-561">YARN-561</a>.
+     Major sub-task reported by Hitesh Shah and fixed by Xuan Gong <br>
+     <b>Nodemanager should set some key information into the environment of every container that it launches.</b><br>
+     <blockquote>Information such as containerId, nodemanager hostname, nodemanager port is not set in the environment when any container is launched. 
+
+For an AM, the RM does all of this for it but for a container launched by an application, all of the above need to be set by the ApplicationMaster. 
+
+At a minimum, the container ID would be a useful piece of information. If the container wishes to talk to its local NM, the nodemanager-related information would also come in handy.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-557">YARN-557</a>.
+     Major bug reported by Chris Nauroth and fixed by Chris Nauroth (applications)<br>
+     <b>TestUnmanagedAMLauncher fails on Windows</b><br>
+     <blockquote>{{TestUnmanagedAMLauncher}} fails on Windows due to attempting to run a Unix-specific command in distributed shell and use of a Unix-specific environment variable to determine username for the {{ContainerLaunchContext}}.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-553">YARN-553</a>.
+     Minor sub-task reported by Harsh J and fixed by Karthik Kambatla (client)<br>
+     <b>Have YarnClient generate a directly usable ApplicationSubmissionContext</b><br>
+     <blockquote>Right now, we're doing multiple steps to create a relevant ApplicationSubmissionContext for a pre-received GetNewApplicationResponse.
+
+{code}
+    GetNewApplicationResponse newApp = yarnClient.getNewApplication();
+    ApplicationId appId = newApp.getApplicationId();
+
+    ApplicationSubmissionContext appContext = Records.newRecord(ApplicationSubmissionContext.class);
+
+    appContext.setApplicationId(appId);
+{code}
+
+A simplified way may be to have the GetNewApplicationResponse itself provide a helper method that builds a usable ApplicationSubmissionContext for us. Something like:
+
+{code}
+GetNewApplicationResponse newApp = yarnClient.getNewApplication();
+ApplicationSubmissionContext appContext = newApp.generateApplicationSubmissionContext();
+{code}
+
+[The above method can also take an arg for the container launch spec, or perhaps pre-load defaults like min-resource, etc. in the returned object, aside of just associating the application ID automatically.]</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-549">YARN-549</a>.
+     Major sub-task reported by Zhijie Shen and fixed by Zhijie Shen <br>
+     <b>YarnClient.submitApplication should wait for application to be accepted by the RM</b><br>
+     <blockquote>Currently, when submitting an application, storeApplication will be called for recovery. However, it is a blocking API, and is likely to block concurrent application submissions. Therefore, it is good to make application submission asynchronous, and postpone storeApplication. YarnClient needs to change to wait for the whole operation to complete so that clients can be notified after the application is really submitted. YarnClient needs to wait for application to reach SUBMITTED state or beyond.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-548">YARN-548</a>.
+     Major sub-task reported by Vadim Bondarev and fixed by Vadim Bondarev <br>
+     <b>Add tests for YarnUncaughtExceptionHandler</b><br>
+     <blockquote></blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-547">YARN-547</a>.
+     Major sub-task reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi <br>
+     <b>Race condition in Public / Private Localizer may result into resource getting downloaded again</b><br>
+     <blockquote>Public Localizer :
+At present when multiple containers try to request a localized resource 
+* If the resource is not present then first it is created and Resource Localization starts ( LocalizedResource is in DOWNLOADING state)
+* Now if in this state multiple ResourceRequestEvents arrive then ResourceLocalizationEvents are sent for all of them.
+
+Most of the time this does not result in a duplicate resource download, but a race condition is present. Inside ResourceLocalization (for public download), all the requests are added to a local attempts map. If a new request comes in, it is first checked against this map before a new download starts for the same resource. For an in-progress download, the request will be in the map, so if the same resource is requested again it will be rejected (i.e. the resource is already being downloaded). However, when the current download completes, the request is removed from this local map. If a LocalizerRequestEvent comes in after this removal, then, as it is not present in the local map, the resource will be downloaded again.
+
+PrivateLocalizer :
+Here a different but similar race condition is present.
+* Here inside findNextResource method call; each LocalizerRunner tries to grab a lock on LocalizerResource. If the lock is not acquired then it will keep trying until the resource state changes to LOCALIZED. This lock will be released by the LocalizerRunner when download completes.
+* Now if another ContainerLocalizer tries to grab the lock on a resource before the LocalizedResource state changes to LOCALIZED, the resource will be downloaded again.
+
+At both the places the root cause of this is that all the threads try to acquire the lock on resource however current state of the LocalizedResource is not taken into consideration.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-542">YARN-542</a>.
+     Major bug reported by Vinod Kumar Vavilapalli and fixed by Zhijie Shen <br>
+     <b>Change the default global AM max-attempts value to be not one</b><br>
+     <blockquote>Today, the global AM max-attempts is set to 1, which is a bad choice. AM max-attempts accounts for both AM-level failures as well as container crashes due to localization issues, lost nodes, etc. To account for AM crashes due to problems that are not caused by user code, mainly lost nodes, we want to give AMs some retries.
+
+I propose we change it to at least two. We can change it to 4 to match other retry-configs.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-539">YARN-539</a>.
+     Major sub-task reported by Omkar Vinit Joshi and fixed by Omkar Vinit Joshi <br>
+     <b>LocalizedResources are leaked in memory in case resource localization fails</b><br>
+     <blockquote>If resource localization fails, the resource remains in memory and is
+1) either cleaned up the next time cache cleanup runs and there is a space crunch (if sufficient space is available in the cache, it will remain in memory), or
+2) reused if a LocalizationRequest comes again for the same resource.
+
+I think when resource localization fails, that event should be sent to the LocalResourceTracker, which will then remove it from its cache.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-538">YARN-538</a>.
+     Major improvement reported by Sandy Ryza and fixed by Sandy Ryza <br>
+     <b>RM address DNS lookup can cause unnecessary slowness on every JHS page load </b><br>
+     <blockquote>When I run the job history server locally, every page load takes tens of seconds.  I profiled the process and discovered that all the extra time was spent inside YarnConfiguration#getRMWebAppURL, trying to resolve 0.0.0.0 to a hostname.  When I changed my yarn.resourcemanager.address to localhost, the page load times decreased drastically.
+
+There's no reason that we need to perform this resolution on every page load.
+</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-536">YARN-536</a>.
+     Major sub-task reported by Xuan Gong and fixed by Xuan Gong <br>
+     <b>Remove ContainerStatus, ContainerState from Container api interface as they will not be called by the container object</b><br>
+     <blockquote>Remove ContainerState and ContainerStatus from the Container interface. They will not be called by the container object.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-534">YARN-534</a>.
+     Major sub-task reported by Jian He and fixed by Jian He (resourcemanager)<br>
+     <b>AM max attempts is not checked when RM restart and try to recover attempts</b><br>
+     <blockquote>Currently, AM max attempts is only checked when the current attempt fails, to decide whether to create a new attempt. If the RM restarts before the max attempt fails, it will not clean the state store; when the RM comes back, it will retry the attempt again.</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-532">YARN-532</a>.
+     Major bug reported by Siddharth Seth and fixed by Siddharth Seth <br>
+     <b>RMAdminProtocolPBClientImpl should implement Closeable</b><br>
+     <blockquote>Required for RPC.stopProxy to work. Already done in most of the other protocols. (MAPREDUCE-5117 addressing the one other protocol missing this)</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-530">YARN-530</a>.
+     Major sub-task reported by Steve Loughran and fixed by Steve Loughran <br>
+     <b>Define Service model strictly, implement AbstractService for robust subclassing, migrate yarn-common services</b><br>
+     <blockquote># Extend the YARN {{Service}} interface as discussed in YARN-117
+# Implement the changes in {{AbstractService}} and {{FilterService}}.
+# Migrate all services in yarn-common to the more robust service model, test.
+
+</blockquote></li>
+<li> <a href="https://issues.apache.org/jira/browse/YARN-525">YARN-525</a>.
+     Major improvement reported by Thomas Graves and fixed by Thomas Graves (capacityscheduler)<br>
+     <b>make CS node-locality-delay refreshable</b><br>
+     <blockquote>the config yarn.scheduler.capacity.node-locality-delay doesn't change when you change the value in capacity_scheduler.xml and then run yarn rmadmin -refreshQueues.</blockquote></li>

[... 2862 lines stripped ...]

