hbase-commits mailing list archives

From st...@apache.org
Subject svn commit: r1417657 - /hbase/trunk/src/docbkx/ops_mgt.xml
Date Wed, 05 Dec 2012 21:31:58 GMT
Author: stack
Date: Wed Dec  5 21:31:57 2012
New Revision: 1417657

URL: http://svn.apache.org/viewvc?rev=1417657&view=rev
More on bad disk handling


Modified: hbase/trunk/src/docbkx/ops_mgt.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/ops_mgt.xml?rev=1417657&r1=1417656&r2=1417657&view=diff
--- hbase/trunk/src/docbkx/ops_mgt.xml (original)
+++ hbase/trunk/src/docbkx/ops_mgt.xml Wed Dec  5 21:31:57 2012
@@ -387,11 +387,18 @@ false
             to go down spewing errors in <filename>dmesg</filename> -- or, for some reason, run much slower than their
             companions.  In this case you want to decommission the disk.  You have two options.  You can
             <xlink href="http://wiki.apache.org/hadoop/FAQ#I_want_to_make_a_large_cluster_smaller_by_taking_out_a_bunch_of_nodes_simultaneously._How_can_this_be_done.3F">decommission the datanode</xlink>
-            or, less disruptive in that only the bad disks data will be rereplicated, is that you can stop the datanode,
+            or, less disruptive in that only the bad disk's data will be re-replicated, you can stop the datanode,
             unmount the bad volume (you can't umount a volume while the datanode is using it), and then restart the
             datanode (presuming you have set dfs.datanode.failed.volumes.tolerated > 0).  The regionserver will
             throw some errors in its logs as it recalibrates where to get its data from -- it will likely
             roll its WAL log too -- but aside from some latency spikes, it should generally keep on chugging.
+            <note>
+                <para>If you are doing short-circuit reads, you will have to move the regions off the regionserver
+                    before you stop the datanode: with short-circuit reads, although the block files are chmod'd so the
+                    regionserver cannot otherwise open them, because it already has the files open it will keep reading
+                    blocks from the bad disk even though the datanode is down.  Move the regions back after you restart the
+                    datanode.</para>
+            </note>
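The restart step in the diff presumes dfs.datanode.failed.volumes.tolerated > 0. That property is set in hdfs-site.xml; a minimal fragment might look like the following (the value 1 is an illustrative choice, not part of the commit):

```xml
<property>
  <!-- Number of volumes allowed to fail before the datanode shuts itself down.
       Default is 0, which makes any volume failure fatal to the datanode. -->
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <value>1</value>
</property>
```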

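The stop/umount/restart sequence the diff describes can be sketched as a short shell script. This is an illustration, not part of the commit: the hadoop-daemon.sh launcher matches Hadoop layouts of that era, the /data/4 mount point is hypothetical, and the DRY_RUN guard is added here so the sketch only prints what it would do:

```shell
#!/bin/sh
# Illustrative sketch of the decommission-one-disk sequence.
# Hostless, single-node view; paths and launcher names are assumptions.
DRY_RUN=${DRY_RUN:-1}

run() {
  # In dry-run mode, print the command instead of executing it.
  if [ "$DRY_RUN" = "1" ]; then
    echo "WOULD RUN: $*"
  else
    "$@"
  fi
}

decommission_disk() {
  bad_volume=$1
  # 1. Stop the datanode (you can't umount a volume it still has open).
  run hadoop-daemon.sh stop datanode
  # 2. Unmount the failing volume.
  run umount "$bad_volume"
  # 3. Restart; with dfs.datanode.failed.volumes.tolerated > 0 the
  #    datanode comes back up without the missing volume.
  run hadoop-daemon.sh start datanode
}

decommission_disk /data/4
```

If short-circuit reads are enabled, remember the note in the diff: move the regions off the regionserver before step 1, and move them back after step 3.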