hadoop-common-commits mailing list archives

From jl...@apache.org
Subject hadoop git commit: YARN-2632. Document NM Restart feature. Contributed by Junping Du and Vinod Kumar Vavilapalli (cherry picked from commit 1e215e8ba2e801eb26f16c307daee756d6b2ca66)
Date Fri, 07 Nov 2014 23:42:20 GMT
Repository: hadoop
Updated Branches:
  refs/heads/branch-2 a5764cb78 -> 944723552


YARN-2632. Document NM Restart feature. Contributed by Junping Du and Vinod Kumar Vavilapalli
(cherry picked from commit 1e215e8ba2e801eb26f16c307daee756d6b2ca66)


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/94472355
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/94472355
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/94472355

Branch: refs/heads/branch-2
Commit: 9447235527b262605e932065a6f6ac891ec9a338
Parents: a5764cb
Author: Jason Lowe <jlowe@apache.org>
Authored: Fri Nov 7 23:40:22 2014 +0000
Committer: Jason Lowe <jlowe@apache.org>
Committed: Fri Nov 7 23:41:29 2014 +0000

----------------------------------------------------------------------
 hadoop-project/src/site/site.xml                |  1 +
 hadoop-yarn-project/CHANGES.txt                 |  3 +
 .../src/site/apt/NodeManagerRestart.apt.vm      | 86 ++++++++++++++++++++
 3 files changed, 90 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hadoop/blob/94472355/hadoop-project/src/site/site.xml
----------------------------------------------------------------------
diff --git a/hadoop-project/src/site/site.xml b/hadoop-project/src/site/site.xml
index 2fd1532..4a2c221 100644
--- a/hadoop-project/src/site/site.xml
+++ b/hadoop-project/src/site/site.xml
@@ -124,6 +124,7 @@
       <item name="Writing YARN Applications" href="hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html"/>
       <item name="YARN Commands" href="hadoop-yarn/hadoop-yarn-site/YarnCommands.html"/>
       <item name="Scheduler Load Simulator" href="hadoop-sls/SchedulerLoadSimulator.html"/>
+      <item name="NodeManager Restart" href="hadoop-yarn/hadoop-yarn-site/NodeManagerRestart.html"/>
     </menu>
 
     <menu name="YARN REST APIs" inherit="top">

http://git-wip-us.apache.org/repos/asf/hadoop/blob/94472355/hadoop-yarn-project/CHANGES.txt
----------------------------------------------------------------------
diff --git a/hadoop-yarn-project/CHANGES.txt b/hadoop-yarn-project/CHANGES.txt
index 530492f..f3c9d4e 100644
--- a/hadoop-yarn-project/CHANGES.txt
+++ b/hadoop-yarn-project/CHANGES.txt
@@ -168,6 +168,9 @@ Release 2.6.0 - UNRELEASED
     YARN-2647. Added a queue CLI for getting queue information. (Sunil Govind via
     vinodkv)
 
+    YARN-2632. Document NM Restart feature. (Junping Du and Vinod Kumar
+    Vavilapalli via jlowe)
+
   IMPROVEMENTS
 
     YARN-2242. Improve exception information on AM launch crashes. (Li Lu 

http://git-wip-us.apache.org/repos/asf/hadoop/blob/94472355/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerRestart.apt.vm
----------------------------------------------------------------------
diff --git a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerRestart.apt.vm b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerRestart.apt.vm
new file mode 100644
index 0000000..ba03f4e
--- /dev/null
+++ b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/NodeManagerRestart.apt.vm
@@ -0,0 +1,86 @@
+~~ Licensed under the Apache License, Version 2.0 (the "License");
+~~ you may not use this file except in compliance with the License.
+~~ You may obtain a copy of the License at
+~~
+~~   http://www.apache.org/licenses/LICENSE-2.0
+~~
+~~ Unless required by applicable law or agreed to in writing, software
+~~ distributed under the License is distributed on an "AS IS" BASIS,
+~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+~~ See the License for the specific language governing permissions and
+~~ limitations under the License. See accompanying LICENSE file.
+
+  ---
+  NodeManager Restart
+  ---
+  ---
+  ${maven.build.timestamp}
+
+NodeManager Restart
+
+* Introduction
+
+  This document gives an overview of NodeManager (NM) restart, a feature that
+  enables the NodeManager to be restarted without losing 
+  the active containers running on the node. At a high level, the NM stores any 
+  necessary state to a local state-store as it processes container-management
+  requests. When the NM restarts, it recovers by first loading state for
+  various subsystems and then letting those subsystems perform recovery using
+  the loaded state.
+
+* Enabling NM Restart
+
+  [[1]] To enable NM Restart functionality, set the following property in <<conf/yarn-site.xml>> to true:
+
+*--------------------------------------+--------------------------------------+
+|| Property                            || Value                                |
+*--------------------------------------+--------------------------------------+
+| <<<yarn.nodemanager.recovery.enabled>>> | |
+| | <<<true>>> (the default value is <<<false>>>) |
+*--------------------------------------+--------------------------------------+ 
+
+  [[2]] Configure a path to the local file-system directory where the
+  NodeManager can save its run state
+
+*--------------------------------------+--------------------------------------+
+|| Property                            || Description                        |
+*--------------------------------------+--------------------------------------+
+| <<<yarn.nodemanager.recovery.dir>>> | |
+| | The local filesystem directory in which the node manager will store state |
+| | when recovery is enabled.  |
+| | The default value is set to |
+| | <<<${hadoop.tmp.dir}/yarn-nm-recovery>>>. |
+*--------------------------------------+--------------------------------------+ 
+
+  [[3]] Configure a valid RPC address for the NodeManager
+  
+*--------------------------------------+--------------------------------------+
+|| Property                            || Description                        |
+*--------------------------------------+--------------------------------------+
+| <<<yarn.nodemanager.address>>> | |
+| | Ephemeral ports (port 0, which is the default) cannot be used for the |
+| | NodeManager's RPC server specified via yarn.nodemanager.address, because |
+| | the NM could then bind to different ports before and after a restart. This |
+| | would break any previously running clients that were communicating with |
+| | the NM before the restart. Explicitly setting yarn.nodemanager.address to |
+| | an address with a specific port number (e.g., 0.0.0.0:45454) is a |
+| | precondition for enabling NM restart. |
+*--------------------------------------+--------------------------------------+
+
+  [[4]] Auxiliary services
+  
+  NodeManagers in a YARN cluster can be configured to run auxiliary services.
+  For a completely functional NM restart, YARN relies on any auxiliary service
+  configured to also support recovery. This usually includes (1) avoiding usage
+  of ephemeral ports so that previously running clients (in this case, usually
+  containers) are not disrupted after restart and (2) having the auxiliary
+  service itself support recoverability by reloading any previous state when
+  NodeManager restarts and reinitializes the auxiliary service.
+  
+  A simple example for the above is the auxiliary service 'ShuffleHandler' for
+  MapReduce (MR). ShuffleHandler respects the above two requirements already,
+  so users/admins don't have to do anything for it to support NM restart: (1) The
+  configuration property <<mapreduce.shuffle.port>> controls which port the
+  ShuffleHandler on a NodeManager host binds to, and it defaults to a
+  non-ephemeral port. (2) The ShuffleHandler service also already supports
+  recovery of previous state after NM restarts.
\ No newline at end of file
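
For reference, the three settings described in steps 1 to 3 of the documentation above might be combined in conf/yarn-site.xml roughly as follows. This is a minimal sketch and not part of the commit: the recovery directory path is a hypothetical example (the documented default is ${hadoop.tmp.dir}/yarn-nm-recovery), and port 45454 is simply the example value the documentation itself mentions.

```xml
<configuration>
  <!-- Step 1: turn on NM restart/recovery (default is false). -->
  <property>
    <name>yarn.nodemanager.recovery.enabled</name>
    <value>true</value>
  </property>

  <!-- Step 2: local directory where the NM saves its run state.
       The path below is an assumed example, not a value from the commit. -->
  <property>
    <name>yarn.nodemanager.recovery.dir</name>
    <value>/var/lib/hadoop-yarn/nm-recovery</value>
  </property>

  <!-- Step 3: fixed RPC port instead of the ephemeral default (port 0),
       so the NM binds to the same port before and after a restart. -->
  <property>
    <name>yarn.nodemanager.address</name>
    <value>0.0.0.0:45454</value>
  </property>
</configuration>
```

Auxiliary services are configured separately. Per the documentation above, the MapReduce ShuffleHandler already binds to a non-ephemeral port by default and supports recovery, so it should need no extra settings here.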

