hbase-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From jmhs...@apache.org
Subject git commit: HBASE-10202 Documentation is lacking information about rolling-restart.sh script (Misty Stanley-Jones)
Date Mon, 18 Aug 2014 22:33:03 GMT
Repository: hbase
Updated Branches:
  refs/heads/branch-1 cbd21dd9c -> fb40c3834


HBASE-10202 Documentation is lacking information about rolling-restart.sh script (Misty Stanley-Jones)


Project: http://git-wip-us.apache.org/repos/asf/hbase/repo
Commit: http://git-wip-us.apache.org/repos/asf/hbase/commit/fb40c383
Tree: http://git-wip-us.apache.org/repos/asf/hbase/tree/fb40c383
Diff: http://git-wip-us.apache.org/repos/asf/hbase/diff/fb40c383

Branch: refs/heads/branch-1
Commit: fb40c383453b5de296fd892d601827eb14a48cda
Parents: cbd21dd
Author: Jonathan M Hsieh <jmhsieh@apache.org>
Authored: Mon Aug 18 15:30:01 2014 -0700
Committer: Jonathan M Hsieh <jmhsieh@apache.org>
Committed: Mon Aug 18 15:31:47 2014 -0700

----------------------------------------------------------------------
 src/main/docbkx/ops_mgt.xml | 174 +++++++++++++++++++++++++++++----------
 1 file changed, 132 insertions(+), 42 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hbase/blob/fb40c383/src/main/docbkx/ops_mgt.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/ops_mgt.xml b/src/main/docbkx/ops_mgt.xml
index 841ad89..8c04c9a 100644
--- a/src/main/docbkx/ops_mgt.xml
+++ b/src/main/docbkx/ops_mgt.xml
@@ -793,48 +793,138 @@ false
     <section
       xml:id="rolling">
       <title>Rolling Restart</title>
-      <para> You can also ask this script to restart a RegionServer after the shutdown
AND move its
-        old regions back into place. The latter you might do to retain data locality. A primitive
-        rolling restart might be effected by running something like the following:</para>
-      <screen>$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart
--reload --debug $i; done &amp;> /tmp/log.txt &amp;</screen>
-      <para> Tail the output of <filename>/tmp/log.txt</filename> to follow
the scripts progress.
-        The above does RegionServers only. The script will also disable the load balancer
before
-        moving the regions. You'd need to do the master update separately. Do it before you
run the
-        above script. Here is a pseudo-script for how you might craft a rolling restart script:
</para>
-      <orderedlist>
-        <listitem>
-          <para>Untar your release, make sure of its configuration and then rsync it
across the
-            cluster. If this is 0.90.2, patch it with HBASE-3744 and HBASE-3756. </para>
-        </listitem>
-        <listitem>
-          <para>Run hbck to ensure the cluster consistent
-            <programlisting>$ ./bin/hbase hbck</programlisting> Effect repairs
if inconsistent.
-          </para>
-        </listitem>
-        <listitem>
-          <para>Restart the Master:
-            <programlisting>$ ./bin/hbase-daemon.sh stop master; ./bin/hbase-daemon.sh
start master</programlisting>
-          </para>
-        </listitem>
-        <listitem>
-          <para>Run the <filename>graceful_stop.sh</filename> script per
RegionServer. For
-            example:</para>
-          <programlisting>$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh
--restart --reload --debug $i; done &amp;> /tmp/log.txt &amp;
-            </programlisting>
-          <para> If you are running thrift or rest servers on the RegionServer, pass
--thrift or
-            --rest options (See usage for <filename>graceful_stop.sh</filename>
script). </para>
-        </listitem>
-        <listitem>
-          <para>Restart the Master again. This will clear out dead servers list and
reenable the
-            balancer. </para>
-        </listitem>
-        <listitem>
-          <para>Run hbck to ensure the cluster is consistent. </para>
-        </listitem>
-      </orderedlist>
-      <para>It is important to drain HBase regions slowly when restarting regionservers.
Otherwise,
-        multiple regions go offline simultaneously as they are re-assigned to other nodes.
Depending
-        on your usage patterns, this might not be desirable. </para>
+
+      <para>Some cluster configuration changes require either the entire cluster, or
the
+          RegionServers, to be restarted in order to pick up the changes. In addition, rolling
+          restarts are supported for upgrading to a minor or maintenance release, and to
a major
+          release if at all possible. See the release notes for release you want to upgrade
to, to
+          find out about limitations to the ability to perform a rolling upgrade.</para>
+      <para>There are multiple ways to restart your cluster nodes, depending on your
situation.
+        These methods are detailed below.</para>
+      <section>
+        <title>Using the <command>rolling-restart.sh</command> Script</title>
+        
+        <para>HBase ships with a script, <filename>bin/rolling-restart.sh</filename>,
that allows
+          you to perform rolling restarts on the entire cluster, the master only, or the
+          RegionServers only. The script is provided as a template for your own script, and
is not
+          explicitly tested. It requires password-less SSH login to be configured and assumes
that
+          you have deployed using a tarball. The script requires you to set some environment
+          variables before running it. Examine the script and modify it to suit your needs.</para>
+        <example>
+          <title><filename>rolling-restart.sh</filename> General Usage</title>
+          <screen language="bourne">
+$ <userinput>./bin/rolling-restart.sh --help</userinput><![CDATA[
+Usage: rolling-restart.sh [--config <hbase-confdir>] [--rs-only] [--master-only] [--graceful]
[--maxthreads xx]          
+        ]]></screen>
+        </example>
+        <variablelist>
+          <varlistentry>
+            <term>Rolling Restart on RegionServers Only</term>
+            <listitem>
+              <para>To perform a rolling restart on the RegionServers only, use the
+                  <code>--rs-only</code> option. This might be necessary if you
need to reboot the
+                individual RegionServer or if you make a configuration change that only affects
+                RegionServers and not the other HBase processes.</para>
+              <para>If you need to restart only a single RegionServer, or if you need
to do extra
+                actions during the restart, use the <filename>bin/graceful_stop.sh</filename>
+                command instead. See <xref linkend="rolling.restart.manual" />.</para>
+            </listitem>
+          </varlistentry>
+          <varlistentry>
+            <term>Rolling Restart on Masters Only</term>
+            <listitem>
+              <para>To perform a rolling restart on the active and backup Masters,
use the
+                  <code>--master-only</code> option. You might use this if you
know that your
+                configuration change only affects the Master and not the RegionServers, or
if you
+                need to restart the server where the active Master is running.</para>
+              <para>If you are not running backup Masters, the Master is simply restarted.
If you
+                are running backup Masters, they are all stopped before any are restarted,
to avoid
+                a race condition in ZooKeeper to determine which is the new Master. First
the main
+                Master is restarted, then the backup Masters are restarted. Directly after
restart,
+                it checks for and cleans out any regions in transition before taking on its
normal
+                workload.</para>
+            </listitem>
+          </varlistentry>
+          <varlistentry>
+            <term>Graceful Restart</term>
+            <listitem>
+              <para>If you specify the <code>--graceful</code> option,
RegionServers are restarted
+                using the <filename>bin/graceful_stop.sh</filename> script, which
moves regions off
+                a RegionServer before restarting it. This is safer, but can delay the
+                restart.</para>
+            </listitem>
+          </varlistentry>
+          <varlistentry>
+            <term>Limiting the Number of Threads</term>
+            <listitem>
+              <para>To limit the rolling restart to using only a specific number of
threads, use the
+                  <code>--maxthreads</code> option.</para>
+            </listitem>
+          </varlistentry>
+        </variablelist>
+      </section>
+      <section xml:id="rolling.restart.manual">
+        <title>Manual Rolling Restart</title>
+        <para>To retain more control over the process, you may wish to manually do
a rolling restart
+          across your cluster. This uses the <command>graceful-stop.sh</command>
command <xref
+            linkend="decommission" />. In this method, you can restart each RegionServer
+          individually and then move its old regions back into place, retaining locality.
If you
+          also need to restart the Master, you need to do it separately, and restart the
Master
+          before restarting the RegionServers using this method. The following is an example
of such
+          a command. You may need to tailor it to your environment. This script does a rolling
+          restart of RegionServers only. It disables the load balancer before moving the
+          regions.</para>
+        <screen><![CDATA[
+$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug
$i; done &> /tmp/log.txt &;     
+        ]]></screen>
+        <para>Monitor the output of the <filename>/tmp/log.txt</filename>
file to follow the
+          progress of the script. </para>
+      </section>
+
+      <section>
+        <title>Logic for Crafting Your Own Rolling Restart Script</title>
+        <para>Use the following guidelines if you want to create your own rolling restart
script.</para>
+        <orderedlist>
+          <listitem>
+            <para>Extract the new release, verify its configuration, and synchronize
it to all nodes
+              of your cluster using <command>rsync</command>, <command>scp</command>,
or another
+              secure synchronization mechanism.</para></listitem>
+          <listitem><para>Use the hbck utility to ensure that the cluster is
consistent.</para>
+          <screen>
+$ ./bin/hbck            
+          </screen>
+            <para>Perform repairs if required. See <xref linkend="hbck" /> for
details.</para>
+          </listitem>
+          <listitem><para>Restart the master first. You may need to modify these
commands if your
+            new HBase directory is different from the old one, such as for an upgrade.</para>
+          <screen>
+$ ./bin/hbase-daemon.sh stop master; ./bin/hbase-daemon.sh start master            
+          </screen>
+          </listitem>
+          <listitem><para>Gracefully restart each RegionServer, using a script
such as the
+            following, from the Master.</para>
+          <screen><![CDATA[
+$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug
$i; done &> /tmp/log.txt &            
+          ]]></screen>
+            <para>If you are running Thrift or REST servers, pass the --thrift or --rest
options.
+              For other available options, run the <command>bin/graceful-stop.sh --help</command>
+              command.</para>
+            <para>It is important to drain HBase regions slowly when restarting multiple
+              RegionServers. Otherwise, multiple regions go offline simultaneously and must
be
+              reassigned to other nodes, which may also go offline soon. This can negatively
affect
+              performance. You can inject delays into the script above, for instance, by
adding a
+              Shell command such as <command>sleep</command>. To wait for 5 minutes
between each
+              RegionServer restart, modify the above script to the following:</para>
+            <screen><![CDATA[
+$ for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug
$i & sleep 5m; done &> /tmp/log.txt &            
+          ]]></screen>
+          </listitem>
+          <listitem><para>Restart the Master again, to clear out the dead servers
list and re-enable
+          the load balancer.</para></listitem>
+          <listitem><para>Run the <command>hbck</command> utility
again, to be sure the cluster is
+            consistent.</para></listitem>
+        </orderedlist>
+      </section>
     </section>
     <section
       xml:id="adding.new.node">


Mime
View raw message