hbase-commits mailing list archives

From dm...@apache.org
Subject svn commit: r1306015 - in /hbase/trunk/src/docbkx: book.xml case_studies.xml configuration.xml performance.xml troubleshooting.xml
Date Tue, 27 Mar 2012 20:50:56 GMT
Author: dmeil
Date: Tue Mar 27 20:50:55 2012
New Revision: 1306015

URL: http://svn.apache.org/viewvc?rev=1306015&view=rev
Log:
hbase-5660.  Creating Case Studies chapter in RefGuide.

Added:
    hbase/trunk/src/docbkx/case_studies.xml   (with props)
Modified:
    hbase/trunk/src/docbkx/book.xml
    hbase/trunk/src/docbkx/configuration.xml
    hbase/trunk/src/docbkx/performance.xml
    hbase/trunk/src/docbkx/troubleshooting.xml

Modified: hbase/trunk/src/docbkx/book.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/book.xml?rev=1306015&r1=1306014&r2=1306015&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/book.xml (original)
+++ hbase/trunk/src/docbkx/book.xml Tue Mar 27 20:50:55 2012
@@ -2396,6 +2396,7 @@ myHtd.setValue(HTableDescriptor.SPLIT_PO
   <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="external_apis.xml" />
   <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="performance.xml" />
   <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="troubleshooting.xml" />
+  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="case_studies.xml" />
   <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ops_mgt.xml" />
   <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="developer.xml" />
 

Added: hbase/trunk/src/docbkx/case_studies.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/case_studies.xml?rev=1306015&view=auto
==============================================================================
--- hbase/trunk/src/docbkx/case_studies.xml (added)
+++ hbase/trunk/src/docbkx/case_studies.xml Tue Mar 27 20:50:55 2012
@@ -0,0 +1,179 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<chapter version="5.0" xml:id="casestudies"
+         xmlns="http://docbook.org/ns/docbook"
+         xmlns:xlink="http://www.w3.org/1999/xlink"
+         xmlns:xi="http://www.w3.org/2001/XInclude"
+         xmlns:svg="http://www.w3.org/2000/svg"
+         xmlns:m="http://www.w3.org/1998/Math/MathML"
+         xmlns:html="http://www.w3.org/1999/xhtml"
+         xmlns:db="http://docbook.org/ns/docbook">
+<!--
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+-->
+  <title>Case Studies</title>
+    <section xml:id="casestudies.overview">
+      <title>Overview</title>
+      <para>This chapter describes a variety of performance and troubleshooting case studies that can 
+      provide a useful blueprint for diagnosing cluster issues.</para>
+      <para>For more information on Performance and Troubleshooting, see <xref linkend="performance"/> and <xref linkend="trouble"/>.
+      </para>
+    </section>
+   
+    <section xml:id="casestudies.slownode">
+      <title>Case Study #1 (Performance Issue On A Single Node)</title>
+      <section><title>Scenario</title>
+        <para>Following a scheduled reboot, one data node began exhibiting unusual behavior.  Routine MapReduce 
+         jobs run against HBase tables which regularly completed in five or six minutes began taking 30 or 40 minutes 
+         to finish. These jobs were consistently found to be waiting on map and reduce tasks assigned to the troubled data node 
+         (e.g., the slow map tasks all had the same Input Split).           
+         The situation came to a head during a distributed copy, when the copy was severely prolonged by the lagging node.
+		</para>
+       </section>
+      <section><title>Hardware</title>
+        <para>Datanodes:
+        <itemizedlist>
+          <listitem>Two 12-core processors</listitem>
+          <listitem>Six Enterprise SATA disks</listitem>
+          <listitem>24GB of RAM</listitem>
+          <listitem>Two bonded gigabit NICs</listitem>
+        </itemizedlist>
+        </para>		
+        <para>Network:
+        <itemizedlist>
+          <listitem>10 Gigabit top-of-rack switches</listitem>
+          <listitem>20 Gigabit bonded interconnects between racks.</listitem>
+        </itemizedlist>
+        </para>
+      </section>
+      <section><title>Hypotheses</title>
+		<section><title>HBase "Hot Spot" Region</title>
+		  <para>We hypothesized that we were experiencing a familiar point of pain: a "hot spot" region in an HBase table, 
+		  where uneven key-space distribution can funnel a huge number of requests to a single HBase region, bombarding the RegionServer 
+		  process and causing slow response times. Examination of the HBase Master status page showed that the number of HBase requests to the 
+		  troubled node was almost zero.  Further, examination of the HBase logs showed that there were no region splits, compactions, or other region transitions 
+		  in progress.  This effectively ruled out a "hot spot" as the root cause of the observed slowness.
+          </para>		
+        </section>
+		<section><title>HBase Region With Non-Local Data</title>
+		  <para>Our next hypothesis was that one of the MapReduce tasks was requesting data from HBase that was not local to the datanode, thus 
+		  forcing HDFS to request data blocks from other servers over the network.  Examination of the datanode logs showed that there were very 
+		  few blocks being requested over the network, indicating that the HBase region was correctly assigned, and that the majority of the necessary 
+		  data was located on the node. This ruled out the possibility of non-local data causing a slowdown.
+          </para>
+        </section>		
+		<section><title>Excessive I/O Wait Due To Swapping Or An Over-Worked Or Failing Hard Disk</title>
+          <para>After concluding that Hadoop and HBase were not likely to be the culprits, we moved on to troubleshooting the datanode's hardware. 
+          Java, by design, will periodically scan its entire memory space to do garbage collection.  If system memory is heavily overcommitted, the Linux 
+          kernel may enter a vicious cycle, using up all of its resources swapping Java heap back and forth from disk to RAM as Java tries to run garbage 
+          collection.  Further, a failing hard disk will often retry reads and/or writes many times before giving up and returning an error. This can manifest 
+          as high iowait, as running processes wait for reads and writes to complete.  Finally, a disk nearing the upper edge of its performance envelope will 
+          begin to cause iowait as it informs the kernel that it cannot accept any more data, and the kernel queues incoming data into the dirty write pool in memory.  
+          However, using <code>vmstat(1)</code> and <code>free(1)</code>, we could see that no swap was being used, and the amount of disk IO was only a few kilobytes per second.
+          </para>		
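+          <para>As a rough sketch of that check (sample invocations only; exact column names vary slightly between 
+          <code>procps</code>/<code>sysstat</code> versions), swap activity shows up in the <code>si</code>/<code>so</code> 
+          columns of <code>vmstat</code>, I/O wait in its <code>wa</code> column, and <code>free</code> reports how much 
+          swap is actually in use:
+<programlisting>
+$ vmstat 5 3      # three samples, five seconds apart; watch si/so (swap in/out) and wa (I/O wait)
+$ free -m         # the "Swap: used" figure should be at or near zero on a healthy node
+$ iostat -x 5 3   # per-disk utilization, if the sysstat package is installed
+</programlisting>
+          </para>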
+        </section>
+		<section><title>Slowness Due To High Processor Usage</title>
+          <para>Next, we checked to see whether the system was performing slowly simply due to very high computational load.  <code>top(1)</code> showed that the system load 
+          was higher than normal, but <code>vmstat(1)</code> and <code>mpstat(1)</code> showed that the amount of processor time being used for actual computation was low.
+          </para>	
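+          <para>A minimal sketch of that kind of check (example invocations; output formats differ between versions):
+<programlisting>
+$ top                  # compare the load average against the per-process %CPU consumers
+$ mpstat -P ALL 5 2    # per-core breakdown; high %iowait with low %usr/%sys points away from computation
+$ vmstat 5 3           # the us/sy columns show time spent on actual computation
+</programlisting>
+          </para>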
+        </section>	
+		<section><title>Network Saturation (The Winner)</title>
+          <para>Since neither the disks nor the processors were being utilized heavily, we moved on to the performance of the network interfaces.  The datanode had two 
+          gigabit ethernet adapters, bonded to form an active-standby interface.  <code>ifconfig(8)</code> showed some unusual anomalies, namely interface errors, overruns, and framing errors. 
+          While not unheard of, these kinds of errors are exceedingly rare on modern hardware which is operating as it should:
+<programlisting>		
+$ /sbin/ifconfig bond0
+bond0  Link encap:Ethernet  HWaddr 00:00:00:00:00:00  
+inet addr:10.x.x.x  Bcast:10.x.x.255  Mask:255.255.255.0
+UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
+RX packets:2990700159 errors:12 dropped:0 overruns:1 frame:6          &lt;--- Look Here! Errors!
+TX packets:3443518196 errors:0 dropped:0 overruns:0 carrier:0
+collisions:0 txqueuelen:0 
+RX bytes:2416328868676 (2.4 TB)  TX bytes:3464991094001 (3.4 TB)
+</programlisting>
+          </para>		
+          <para>These errors immediately led us to suspect that one or more of the ethernet interfaces might have negotiated the wrong line speed.  This was confirmed both by running an ICMP ping 
+          from an external host and observing round-trip-time in excess of 700ms, and by running <code>ethtool(8)</code> on the members of the bond interface and discovering that the active interface 
+          was operating at 100Mb/s, full duplex.
+<programlisting>		
+$ sudo ethtool eth0
+Settings for eth0:
+Supported ports: [ TP ]
+Supported link modes:   10baseT/Half 10baseT/Full 
+                       100baseT/Half 100baseT/Full 
+                       1000baseT/Full 
+Supports auto-negotiation: Yes
+Advertised link modes:  10baseT/Half 10baseT/Full 
+                       100baseT/Half 100baseT/Full 
+                       1000baseT/Full 
+Advertised pause frame use: No
+Advertised auto-negotiation: Yes
+Link partner advertised link modes:  Not reported
+Link partner advertised pause frame use: No
+Link partner advertised auto-negotiation: No
+Speed: 100Mb/s                                     &lt;--- Look Here!  Should say 1000Mb/s!
+Duplex: Full
+Port: Twisted Pair
+PHYAD: 1
+Transceiver: internal
+Auto-negotiation: on
+MDI-X: Unknown
+Supports Wake-on: umbg
+Wake-on: g
+Current message level: 0x00000003 (3)
+Link detected: yes
+</programlisting>		
+		  </para>
+		  <para>In normal operation, the ICMP ping round-trip time should be around 20ms, and the interface speed and duplex should read "1000Mb/s" and "Full", respectively.  
+		  </para>
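+		  <para>As a quick sanity check from another host (the hostname below is a placeholder), latency alone 
+		  will often give the problem away:
+<programlisting>
+$ ping -c 5 datanode1.example.com   # round-trip times far above ~20ms inside a datacenter are suspect
+</programlisting>
+		  </para>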
+	    </section>
+     </section>  
+   	<section><title>Resolution</title>
+   	  <para>After determining that the active ethernet adapter was at the incorrect speed, we used the <code>ifenslave(8)</code> command to make the standby interface 
+   	  the active interface, which yielded an immediate improvement in MapReduce performance and a tenfold improvement in network throughput.
+	  </para>
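+   	  <para>As a sketch of that kind of failover (the interface names are examples and will differ per host), 
+   	  the active slave of a bond can be switched with <code>ifenslave(8)</code> or through the bonding driver's 
+   	  sysfs interface, and the result confirmed from procfs:
+<programlisting>
+$ sudo ifenslave -c bond0 eth1                                      # make eth1 the active slave of bond0
+$ echo eth1 | sudo tee /sys/class/net/bond0/bonding/active_slave   # equivalent, via sysfs
+$ cat /proc/net/bonding/bond0                                       # should now report "Currently Active Slave: eth1"
+</programlisting>
+   	  </para>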
+	  <para>On the next trip to the datacenter, we determined that the line speed issue was ultimately caused by a bad network cable, which was replaced.
+	  </para>
+	</section>
+   </section>  <!--  case study -->
+    <section xml:id="casestudies.perf.1">
+      <title>Case Study #2 (Performance Research 2012)</title>
+      <para>Investigation results of a self-described "we're not sure what's wrong, but it seems slow" problem. 
+      <link xlink:href="http://gbif.blogspot.com/2012/03/hbase-performance-evaluation-continued.html">http://gbif.blogspot.com/2012/03/hbase-performance-evaluation-continued.html</link>
+      </para>
+    </section>
+
+    <section xml:id="casestudies.perf.2">
+      <title>Case Study #3 (Performance Research 2010)</title>
+      <para>
+      Investigation results of general cluster performance from 2010.  Although this research is on an older version of the codebase, this writeup
+      is still very useful in terms of approach.
+      <link xlink:href="http://hstack.org/hbase-performance-testing/">http://hstack.org/hbase-performance-testing/</link>
+      </para>
+    </section>
+
+    <section xml:id="casestudies.xceivers">
+      <title>Case Study #4 (xcievers Config)</title>
+      <para>Case study of configuring <code>xceivers</code> and diagnosing errors from misconfigurations.
+      <link xlink:href="http://www.larsgeorge.com/2012/03/hadoop-hbase-and-xceivers.html">http://www.larsgeorge.com/2012/03/hadoop-hbase-and-xceivers.html</link>
+      </para>
+      <para>See also <xref linkend="dfs.datanode.max.xcievers"/>.
+      </para>
+    </section>
+	
+  </chapter>

Propchange: hbase/trunk/src/docbkx/case_studies.xml
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Modified: hbase/trunk/src/docbkx/configuration.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/configuration.xml?rev=1306015&r1=1306014&r2=1306015&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/configuration.xml (original)
+++ hbase/trunk/src/docbkx/configuration.xml Tue Mar 27 20:50:55 2012
@@ -335,6 +335,8 @@ to ensure well-formedness of your docume
         java.io.IOException: No live nodes contain current block. Will get new
         block locations from namenode and retry...</code>
        <footnote><para>See <link xlink:href="http://ccgtech.blogspot.com/2010/02/hadoop-hdfs-deceived-by-xciever.html">Hadoop HDFS: Deceived by Xciever</link> for an informative rant on xceivering.</para></footnote></para>
+       <para>See also <xref linkend="casestudies.xceivers"/>
+       </para>
       </section>
      
      </section>    <!--  hadoop -->

Modified: hbase/trunk/src/docbkx/performance.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/performance.xml?rev=1306015&r1=1306014&r2=1306015&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/performance.xml (original)
+++ hbase/trunk/src/docbkx/performance.xml Tue Mar 27 20:50:55 2012
@@ -96,7 +96,7 @@
     </section>
     <section xml:id="perf.network.ints">
       <title>Network Interfaces</title>
-      <para>Are all the network interfaces functioning correctly?  Are you sure?  See the Troubleshooting Case Study in <xref linkend="trouble.casestudy"/>.
+      <para>Are all the network interfaces functioning correctly?  Are you sure?  See the Troubleshooting Case Study in <xref linkend="casestudies.slownode"/>.
       </para>
     </section>
   </section>  <!-- network -->
@@ -433,7 +433,7 @@ Deferred log flush can be configured on 
         <title>MapReduce - Input Splits</title>
        <para>For MapReduce jobs that use HBase tables as a source, if there is a pattern where the "slow" map tasks seem to 
         have the same Input Split (i.e., the RegionServer serving the data), see the 
-        Troubleshooting Case Study in <xref linkend="trouble.casestudy"/>.
+        Troubleshooting Case Study in <xref linkend="casestudies.slownode"/>.
         </para>
     </section>
 
@@ -539,4 +539,9 @@ htable.close();</programlisting></para>
     because EC2 issues are practically a separate class of performance issues.
    </para>
   </section>
+  
+  <section xml:id="perf.casestudy"><title>Case Studies</title>
+      <para>For Performance and Troubleshooting Case Studies, see <xref linkend="casestudies"/>.
+      </para>
+  </section>
 </chapter>

Modified: hbase/trunk/src/docbkx/troubleshooting.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/troubleshooting.xml?rev=1306015&r1=1306014&r2=1306015&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/troubleshooting.xml (original)
+++ hbase/trunk/src/docbkx/troubleshooting.xml Tue Mar 27 20:50:55 2012
@@ -1028,122 +1028,9 @@ in your Hadoop's <filename>lib</filename
 </section>
 
     <section xml:id="trouble.casestudy">
-      <title>Case Study</title>
-      <para>The issue described in this case study is somewhat exotic, but the thought process should 
-      provide a useful blueprint on diagnosing cluster issues.</para>
-      <section><title>Scenario</title>
-        <para>Following a scheduled reboot, one data node began exhibiting unusual behavior.  Routine MapReduce 
-         jobs run against HBase tables which regularly completed in five or six minutes began taking 30 or 40 minutes 
-         to finish. These jobs were consistently found to be waiting on map and reduce tasks assigned to the troubled data node 
-         (e.g., the slow map tasks all had the same Input Split).           
-         The situation came to a head during a distributed copy, when the copy was severely prolonged by the lagging node.
-		</para>
-       </section>
-      <section><title>Hardware</title>
-        <para>Datanodes:
-        <itemizedlist>
-          <listitem>Two 12-core processors</listitem>
-          <listitem>Six Enerprise SATA disks</listitem>
-          <listitem>24GB of RAM</listitem>
-          <listitem>Two bonded gigabit NICs</listitem>
-        </itemizedlist>
-        </para>		
-        <para>Network:
-        <itemizedlist>
-          <listitem>10 Gigabit top-of-rack switches</listitem>
-          <listitem>20 Gigabit bonded interconnects between racks.</listitem>
-        </itemizedlist>
-        </para>
-      </section>
-      <section><title>Hypotheses</title>
-		<section><title>HBase "Hot Spot" Region</title>
-		  <para>We hypothesized that we were experiencing a familiar point of pain: a "hot spot" region in an HBase table, 
-		  where uneven key-space distribution can funnel a huge number of requests to a single HBase region, bombarding the RegionServer 
-		  process and cause slow response time. Examination of the HBase Master status page showed that the number of HBase requests to the 
-		  troubled node was almost zero.  Further, examination of the HBase logs showed that there were no region splits, compactions, or other region transitions 
-		  in progress.  This effectively ruled out a "hot spot" as the root cause of the observed slowness.
-          </para>		
-        </section>
-		<section><title>HBase Region With Non-Local Data</title>
-		  <para>Our next hypothesis was that one of the MapReduce tasks was requesting data from HBase that was not local to the datanode, thus 
-		  forcing HDFS to request data blocks from other servers over the network.  Examination of the datanode logs showed that there were very 
-		  few blocks being requested over the network, indicating that the HBase region was correctly assigned, and that the majority of the necessary 
-		  data was located on the node. This ruled out the possibility of non-local data causing a slowdown.
-          </para>
-        </section>		
-		<section><title>Excessive I/O Wait Due To Swapping Or An Over-Worked Or Failing Hard Disk</title>
-          <para>After concluding that the Hadoop and HBase were not likely to be the culprits, we moved on to troubleshooting the datanode's hardware. 
-          Java, by design, will periodically scan its entire memory space to do garbage collection.  If system memory is heavily overcommitted, the Linux 
-          kernel may enter a vicious cycle, using up all of its resources swapping Java heap back and forth from disk to RAM as Java tries to run garbage 
-          collection.  Further, a failing hard disk will often retry reads and/or writes many times before giving up and returning an error. This can manifest 
-          as high iowait, as running processes wait for reads and writes to complete.  Finally, a disk nearing the upper edge of its performance envelope will 
-          begin to cause iowait as it informs the kernel that it cannot accept any more data, and the kernel queues incoming data into the dirty write pool in memory.  
-          However, using <code>vmstat(1)</code> and <code>free(1)</code>, we could see that no swap was being used, and the amount of disk IO was only a few kilobytes per second.
-          </para>		
-        </section>
-		<section><title>Slowness Due To High Processor Usage</title>
-          <para>Next, we checked to see whether the system was performing slowly simply due to very high computational load.  <code>top(1)</code> showed that the system load 
-          was higher than normal, but <code>vmstat(1)</code> and <code>mpstat(1)</code> showed that the amount of processor being used for actual computation was low.
-          </para>	
-        </section>	
-		<section><title>Network Saturation (The Winner)</title>
-          <para>Since neither the disks nor the processors were being utilized heavily, we moved on to the performance of the network interfaces.  The datanode had two 
-          gigabit ethernet adapters, bonded to form an active-standby interface.  <code>ifconfig(8)</code> showed some unusual anomalies, namely interface errors, overruns, framing errors. 
-          While not unheard of, these kinds of errors are exceedingly rare on modern hardware which is operating as it should:
-<programlisting>		
-$ /sbin/ifconfig bond0
-bond0  Link encap:Ethernet  HWaddr 00:00:00:00:00:00  
-inet addr:10.x.x.x  Bcast:10.x.x.255  Mask:255.255.255.0
-UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
-RX packets:2990700159 errors:12 dropped:0 overruns:1 frame:6          &lt;--- Look Here! Errors!
-TX packets:3443518196 errors:0 dropped:0 overruns:0 carrier:0
-collisions:0 txqueuelen:0 
-RX bytes:2416328868676 (2.4 TB)  TX bytes:3464991094001 (3.4 TB)
-</programlisting>
-          </para>		
-          <para>These errors immediately lead us to suspect that one or more of the ethernet interfaces might have negotiated the wrong line speed.  This was confirmed both by running an ICMP ping 
-          from an external host and observing round-trip-time in excess of 700ms, and by running <code>ethtool(8)</code> on the members of the bond interface and discovering that the active interface 
-          was operating at 100Mbs/, full duplex.
-<programlisting>		
-$ sudo ethtool eth0
-Settings for eth0:
-Supported ports: [ TP ]
-Supported link modes:   10baseT/Half 10baseT/Full 
-                       100baseT/Half 100baseT/Full 
-                       1000baseT/Full 
-Supports auto-negotiation: Yes
-Advertised link modes:  10baseT/Half 10baseT/Full 
-                       100baseT/Half 100baseT/Full 
-                       1000baseT/Full 
-Advertised pause frame use: No
-Advertised auto-negotiation: Yes
-Link partner advertised link modes:  Not reported
-Link partner advertised pause frame use: No
-Link partner advertised auto-negotiation: No
-Speed: 100Mb/s                                     &lt;--- Look Here!  Should say 1000Mb/s!
-Duplex: Full
-Port: Twisted Pair
-PHYAD: 1
-Transceiver: internal
-Auto-negotiation: on
-MDI-X: Unknown
-Supports Wake-on: umbg
-Wake-on: g
-Current message level: 0x00000003 (3)
-Link detected: yes
-</programlisting>		
-		  </para>
-		  <para>In normal operation, the ICMP ping round trip time should be around 20ms, and the interface speed and duplex should read, "1000MB/s", and, "Full", respectively.  
-		  </para>
-	    </section>
-     </section>  
-   	<section><title>Resolution</title>
-   	  <para>After determining that the active ethernet adapter was at the incorrect speed, we used the <code>ifenslave(8)</code> command to make the standby interface 
-   	  the active interface, which yielded an immediate improvement in MapReduce performance, and a 10 times improvement in network throughput:
-	  </para>
-	  <para>On the next trip to the datacenter, we determined that the line speed issue was ultimately caused by a bad network cable, which was replaced.
-	  </para>
-	</section>
-   </section>  <!--  case study -->
+      <title>Case Studies</title>
+      <para>For Performance and Troubleshooting Case Studies, see <xref linkend="casestudies"/>.
+      </para>
+    </section>
 	
   </chapter>


