hbase-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From st...@apache.org
Subject [08/14] HBASE-11199 One-time effort to pretty-print the Docbook XML, to make further patch review easier (Misty Stanley-Jones)
Date Wed, 28 May 2014 14:59:06 GMT
http://git-wip-us.apache.org/repos/asf/hbase/blob/63e8304e/src/main/docbkx/performance.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/performance.xml b/src/main/docbkx/performance.xml
index a56c522..cbe600e 100644
--- a/src/main/docbkx/performance.xml
+++ b/src/main/docbkx/performance.xml
@@ -1,13 +1,15 @@
 <?xml version="1.0" encoding="UTF-8"?>
-<chapter version="5.0" xml:id="performance"
-         xmlns="http://docbook.org/ns/docbook"
-         xmlns:xlink="http://www.w3.org/1999/xlink"
-         xmlns:xi="http://www.w3.org/2001/XInclude"
-         xmlns:svg="http://www.w3.org/2000/svg"
-         xmlns:m="http://www.w3.org/1998/Math/MathML"
-         xmlns:html="http://www.w3.org/1999/xhtml"
-         xmlns:db="http://docbook.org/ns/docbook">
-<!--
+<chapter
+  version="5.0"
+  xml:id="performance"
+  xmlns="http://docbook.org/ns/docbook"
+  xmlns:xlink="http://www.w3.org/1999/xlink"
+  xmlns:xi="http://www.w3.org/2001/XInclude"
+  xmlns:svg="http://www.w3.org/2000/svg"
+  xmlns:m="http://www.w3.org/1998/Math/MathML"
+  xmlns:html="http://www.w3.org/1999/xhtml"
+  xmlns:db="http://docbook.org/ns/docbook">
+  <!--
 /**
  * Licensed to the Apache Software Foundation (ASF) under one
  * or more contributor license agreements.  See the NOTICE file
@@ -28,186 +30,232 @@
 -->
   <title>Apache HBase Performance Tuning</title>
 
-  <section xml:id="perf.os">
+  <section
+    xml:id="perf.os">
     <title>Operating System</title>
-        <section xml:id="perf.os.ram">
-          <title>Memory</title>
-          <para>RAM, RAM, RAM.  Don't starve HBase.</para>
-        </section>
-        <section xml:id="perf.os.64">
-          <title>64-bit</title>
-          <para>Use a 64-bit platform (and 64-bit JVM).</para>
-        </section>
-        <section xml:id="perf.os.swap">
-          <title>Swapping</title>
-          <para>Watch out for swapping.  Set swappiness to 0.</para>
-        </section>
+    <section
+      xml:id="perf.os.ram">
+      <title>Memory</title>
+      <para>RAM, RAM, RAM. Don't starve HBase.</para>
+    </section>
+    <section
+      xml:id="perf.os.64">
+      <title>64-bit</title>
+      <para>Use a 64-bit platform (and 64-bit JVM).</para>
+    </section>
+    <section
+      xml:id="perf.os.swap">
+      <title>Swapping</title>
+      <para>Watch out for swapping. Set swappiness to 0.</para>
+    </section>
   </section>
-  <section xml:id="perf.network">
+  <section
+    xml:id="perf.network">
     <title>Network</title>
     <para> Perhaps the most important factor in avoiding network issues degrading Hadoop and HBase
       performance is the switching hardware that is used, decisions made early in the scope of the
       project can cause major problems when you double or triple the size of your cluster (or more). </para>
-    <para>
-    Important items to consider:
-        <itemizedlist>
-          <listitem><para>Switching capacity of the device</para></listitem>
-          <listitem><para>Number of systems connected</para></listitem>
-          <listitem><para>Uplink capacity</para></listitem>
-        </itemizedlist>
+    <para> Important items to consider: <itemizedlist>
+        <listitem>
+          <para>Switching capacity of the device</para>
+        </listitem>
+        <listitem>
+          <para>Number of systems connected</para>
+        </listitem>
+        <listitem>
+          <para>Uplink capacity</para>
+        </listitem>
+      </itemizedlist>
     </para>
-    <section xml:id="perf.network.1switch">
+    <section
+      xml:id="perf.network.1switch">
       <title>Single Switch</title>
-      <para>The single most important factor in this configuration is that the switching capacity of the hardware is capable of
-      handling the traffic which can be generated by all systems connected to the switch. Some lower priced commodity hardware
-      can have a slower switching capacity than could be utilized by a full switch.
-      </para>
+      <para>The single most important factor in this configuration is that the switching capacity of
+        the hardware is capable of handling the traffic which can be generated by all systems
+        connected to the switch. Some lower priced commodity hardware can have a slower switching
+        capacity than could be utilized by a full switch. </para>
     </section>
-    <section xml:id="perf.network.2switch">
+    <section
+      xml:id="perf.network.2switch">
       <title>Multiple Switches</title>
-      <para>Multiple switches are a potential pitfall in the architecture.   The most common configuration of lower priced hardware is a
-      simple 1Gbps uplink from one switch to another. This often overlooked pinch point can easily become a bottleneck for cluster communication.
-      Especially with MapReduce jobs that are both reading and writing a lot of data the communication across this uplink could be saturated.
-      </para>
-      <para>Mitigation of this issue is fairly simple and can be accomplished in multiple ways:
+      <para>Multiple switches are a potential pitfall in the architecture. The most common
+        configuration of lower priced hardware is a simple 1Gbps uplink from one switch to another.
+        This often overlooked pinch point can easily become a bottleneck for cluster communication.
+        Especially with MapReduce jobs that are both reading and writing a lot of data the
+        communication across this uplink could be saturated. </para>
+      <para>Mitigation of this issue is fairly simple and can be accomplished in multiple ways: </para>
       <itemizedlist>
-        <listitem><para>Use appropriate hardware for the scale of the cluster which you're attempting to build.</para></listitem>
-        <listitem><para>Use larger single switch configurations i.e. single 48 port as opposed to 2x 24 port</para></listitem>
-        <listitem><para>Configure port trunking for uplinks to utilize multiple interfaces to increase cross switch bandwidth.</para></listitem>
+        <listitem>
+          <para>Use appropriate hardware for the scale of the cluster which you're attempting to
+            build.</para>
+        </listitem>
+        <listitem>
+          <para>Use larger single switch configurations i.e. single 48 port as opposed to 2x 24
+            port</para>
+        </listitem>
+        <listitem>
+          <para>Configure port trunking for uplinks to utilize multiple interfaces to increase cross
+            switch bandwidth.</para>
+        </listitem>
       </itemizedlist>
-      </para>
     </section>
-    <section xml:id="perf.network.multirack">
+    <section
+      xml:id="perf.network.multirack">
       <title>Multiple Racks</title>
-      <para>Multiple rack configurations carry the same potential issues as multiple switches, and can suffer performance degradation from two main areas:
-         <itemizedlist>
-           <listitem><para>Poor switch capacity performance</para></listitem>
-           <listitem><para>Insufficient uplink to another rack</para></listitem>
-         </itemizedlist>
-      If the the switches in your rack have appropriate switching capacity to handle all the hosts at full speed, the next most likely issue will be caused by homing
-      more of your cluster across racks.  The easiest way to avoid issues when spanning multiple racks is to use port trunking to create a bonded uplink to other racks.
-      The downside of this method however, is in the overhead of ports that could potentially be used. An example of this is, creating an 8Gbps port channel from rack
-      A to rack B, using 8 of your 24 ports to communicate between racks gives you a poor ROI, using too few however can mean you're not getting the most out of your cluster.
-      </para>
-      <para>Using 10Gbe links between racks will greatly increase performance, and assuming your switches support a 10Gbe uplink or allow for an expansion card will allow you to
-      save your ports for machines as opposed to uplinks.
-      </para>
-    </section>
-    <section xml:id="perf.network.ints">
+      <para>Multiple rack configurations carry the same potential issues as multiple switches, and
+        can suffer performance degradation from two main areas: </para>
+      <itemizedlist>
+        <listitem>
+          <para>Poor switch capacity performance</para>
+        </listitem>
+        <listitem>
+          <para>Insufficient uplink to another rack</para>
+        </listitem>
+      </itemizedlist>
+      <para>If the the switches in your rack have appropriate switching capacity to handle all the
+        hosts at full speed, the next most likely issue will be caused by homing more of your
+        cluster across racks. The easiest way to avoid issues when spanning multiple racks is to use
+        port trunking to create a bonded uplink to other racks. The downside of this method however,
+        is in the overhead of ports that could potentially be used. An example of this is, creating
+        an 8Gbps port channel from rack A to rack B, using 8 of your 24 ports to communicate between
+        racks gives you a poor ROI, using too few however can mean you're not getting the most out
+        of your cluster. </para>
+      <para>Using 10Gbe links between racks will greatly increase performance, and assuming your
+        switches support a 10Gbe uplink or allow for an expansion card will allow you to save your
+        ports for machines as opposed to uplinks. </para>
+    </section>
+    <section
+      xml:id="perf.network.ints">
       <title>Network Interfaces</title>
-      <para>Are all the network interfaces functioning correctly?  Are you sure?  See the Troubleshooting Case Study in <xref linkend="casestudies.slownode"/>.
-      </para>
+      <para>Are all the network interfaces functioning correctly? Are you sure? See the
+        Troubleshooting Case Study in <xref
+          linkend="casestudies.slownode" />. </para>
     </section>
-  </section>  <!-- network -->
+  </section>
+  <!-- network -->
 
-  <section xml:id="jvm">
+  <section
+    xml:id="jvm">
     <title>Java</title>
 
-    <section xml:id="gc">
+    <section
+      xml:id="gc">
       <title>The Garbage Collector and Apache HBase</title>
 
-      <section xml:id="gcpause">
+      <section
+        xml:id="gcpause">
         <title>Long GC pauses</title>
 
-        <para xml:id="mslab">In his presentation, <link
-        xlink:href="http://www.slideshare.net/cloudera/hbase-hug-presentation">Avoiding
-        Full GCs with MemStore-Local Allocation Buffers</link>, Todd Lipcon
-        describes two cases of stop-the-world garbage collections common in
-        HBase, especially during loading; CMS failure modes and old generation
-        heap fragmentation brought. To address the first, start the CMS
-        earlier than default by adding
-        <code>-XX:CMSInitiatingOccupancyFraction</code> and setting it down
-        from defaults. Start at 60 or 70 percent (The lower you bring down the
-        threshold, the more GCing is done, the more CPU used). To address the
-        second fragmentation issue, Todd added an experimental facility,
-        <indexterm><primary>MSLAB</primary></indexterm>, that
-        must be explicitly enabled in Apache HBase 0.90.x (Its defaulted to be on in
-        Apache 0.92.x HBase). See <code>hbase.hregion.memstore.mslab.enabled</code>
-        to true in your <classname>Configuration</classname>. See the cited
-        slides for background and detail<footnote><para>The latest jvms do better
-        regards fragmentation so make sure you are running a recent release.
-        Read down in the message,
-        <link xlink:href="http://osdir.com/ml/hotspot-gc-use/2011-11/msg00002.html">Identifying concurrent mode failures caused by fragmentation</link>.</para></footnote>.
-        Be aware that when enabled, each MemStore instance will occupy at least
-        an MSLAB instance of memory.  If you have thousands of regions or lots
-        of regions each with many column families, this allocation of MSLAB
-        may be responsible for a good portion of your heap allocation and in
-        an extreme case cause you to OOME.  Disable MSLAB in this case, or
-        lower the amount of memory it uses or float less regions per server.
-        </para>
-        <para>If you have a write-heavy workload, check out
-            <link xlink:href="https://issues.apache.org/jira/browse/HBASE-8163">HBASE-8163 MemStoreChunkPool: An improvement for JAVA GC when using MSLAB</link>.
-            It describes configurations to lower the amount of young GC during write-heavy loadings.  If you do not have HBASE-8163 installed, and you are
-            trying to improve your young GC times, one trick to consider -- courtesy of our Liang Xie -- is to set the GC config <varname>-XX:PretenureSizeThreshold</varname> in <filename>hbase-env.sh</filename>
-            to be just smaller than the size of <varname>hbase.hregion.memstore.mslab.chunksize</varname> so MSLAB allocations happen in the
-            tenured space directly rather than first in the young gen.  You'd do this because these MSLAB allocations are going to likely make it
-            to the old gen anyways and rather than pay the price of a copies between s0 and s1 in eden space followed by the copy up from
-            young to old gen after the MSLABs have achieved sufficient tenure, save a bit of YGC churn and allocate in the old gen directly.
-            </para>
-        <para>For more information about GC logs, see <xref linkend="trouble.log.gc" />.
-        </para>
+        <para
+          xml:id="mslab">In his presentation, <link
+            xlink:href="http://www.slideshare.net/cloudera/hbase-hug-presentation">Avoiding Full GCs
+            with MemStore-Local Allocation Buffers</link>, Todd Lipcon describes two cases of
+          stop-the-world garbage collections common in HBase, especially during loading; CMS failure
+          modes and old generation heap fragmentation brought. To address the first, start the CMS
+          earlier than default by adding <code>-XX:CMSInitiatingOccupancyFraction</code> and setting
+          it down from defaults. Start at 60 or 70 percent (The lower you bring down the threshold,
+          the more GCing is done, the more CPU used). To address the second fragmentation issue,
+          Todd added an experimental facility, <indexterm><primary>MSLAB</primary></indexterm>, that
+          must be explicitly enabled in Apache HBase 0.90.x (Its defaulted to be on in Apache 0.92.x
+          HBase). See <code>hbase.hregion.memstore.mslab.enabled</code> to true in your
+            <classname>Configuration</classname>. See the cited slides for background and detail<footnote>
+            <para>The latest jvms do better regards fragmentation so make sure you are running a
+              recent release. Read down in the message, <link
+                xlink:href="http://osdir.com/ml/hotspot-gc-use/2011-11/msg00002.html">Identifying
+                concurrent mode failures caused by fragmentation</link>.</para>
+          </footnote>. Be aware that when enabled, each MemStore instance will occupy at least an
+          MSLAB instance of memory. If you have thousands of regions or lots of regions each with
+          many column families, this allocation of MSLAB may be responsible for a good portion of
+          your heap allocation and in an extreme case cause you to OOME. Disable MSLAB in this case,
+          or lower the amount of memory it uses or float less regions per server. </para>
+        <para>If you have a write-heavy workload, check out <link
+            xlink:href="https://issues.apache.org/jira/browse/HBASE-8163">HBASE-8163
+            MemStoreChunkPool: An improvement for JAVA GC when using MSLAB</link>. It describes
+          configurations to lower the amount of young GC during write-heavy loadings. If you do not
+          have HBASE-8163 installed, and you are trying to improve your young GC times, one trick to
+          consider -- courtesy of our Liang Xie -- is to set the GC config
+            <varname>-XX:PretenureSizeThreshold</varname> in <filename>hbase-env.sh</filename> to be
+          just smaller than the size of <varname>hbase.hregion.memstore.mslab.chunksize</varname> so
+          MSLAB allocations happen in the tenured space directly rather than first in the young gen.
+          You'd do this because these MSLAB allocations are going to likely make it to the old gen
+          anyways and rather than pay the price of a copies between s0 and s1 in eden space followed
+          by the copy up from young to old gen after the MSLABs have achieved sufficient tenure,
+          save a bit of YGC churn and allocate in the old gen directly. </para>
+        <para>For more information about GC logs, see <xref
+            linkend="trouble.log.gc" />. </para>
       </section>
     </section>
   </section>
 
-  <section xml:id="perf.configurations">
+  <section
+    xml:id="perf.configurations">
     <title>HBase Configurations</title>
 
-    <para>See <xref linkend="recommended_configurations" />.</para>
+    <para>See <xref
+        linkend="recommended_configurations" />.</para>
 
-    <section xml:id="perf.compactions.and.splits">
+    <section
+      xml:id="perf.compactions.and.splits">
       <title>Managing Compactions</title>
 
       <para>For larger systems, managing <link
-      linkend="disable.splitting">compactions and splits</link> may be
-      something you want to consider.</para>
-    </section>
-
-    <section xml:id="perf.handlers">
-        <title><varname>hbase.regionserver.handler.count</varname></title>
-        <para>See <xref linkend="hbase.regionserver.handler.count"/>.
-	    </para>
-    </section>
-    <section xml:id="perf.hfile.block.cache.size">
-        <title><varname>hfile.block.cache.size</varname></title>
-        <para>See <xref linkend="hfile.block.cache.size"/>.
-        A memory setting for the RegionServer process.
-        </para>
-    </section>
-    <section xml:id="perf.rs.memstore.size">
-        <title><varname>hbase.regionserver.global.memstore.size</varname></title>
-        <para>See <xref linkend="hbase.regionserver.global.memstore.size"/>.
-        This memory setting is often adjusted for the RegionServer process depending on needs.
-        </para>
-    </section>
-    <section xml:id="perf.rs.memstore.size.lower.limit">
-        <title><varname>hbase.regionserver.global.memstore.size.lower.limit</varname></title>
-        <para>See <xref linkend="hbase.regionserver.global.memstore.size.lower.limit"/>.
-        This memory setting is often adjusted for the RegionServer process depending on needs.
-        </para>
-    </section>
-    <section xml:id="perf.hstore.blockingstorefiles">
-        <title><varname>hbase.hstore.blockingStoreFiles</varname></title>
-        <para>See <xref linkend="hbase.hstore.blockingStoreFiles"/>.
-        If there is blocking in the RegionServer logs, increasing this can help.
-        </para>
-    </section>
-    <section xml:id="perf.hregion.memstore.block.multiplier">
-        <title><varname>hbase.hregion.memstore.block.multiplier</varname></title>
-        <para>See <xref linkend="hbase.hregion.memstore.block.multiplier"/>.
-        If there is enough RAM, increasing this can help.
-        </para>
-    </section>
-    <section xml:id="hbase.regionserver.checksum.verify">
-        <title><varname>hbase.regionserver.checksum.verify</varname></title>
-        <para>Have HBase write the checksum into the datablock and save
-        having to do the checksum seek whenever you read.</para>
-
-        <para>See <xref linkend="hbase.regionserver.checksum.verify"/>,
-        <xref linkend="hbase.hstore.bytes.per.checksum"/> and <xref linkend="hbase.hstore.checksum.algorithm"/>
-        For more information see the
-        release note on <link xlink:href="https://issues.apache.org/jira/browse/HBASE-5074">HBASE-5074 support checksums in HBase block cache</link>.
-        </para>
+          linkend="disable.splitting">compactions and splits</link> may be something you want to
+        consider.</para>
+    </section>
+
+    <section
+      xml:id="perf.handlers">
+      <title><varname>hbase.regionserver.handler.count</varname></title>
+      <para>See <xref
+          linkend="hbase.regionserver.handler.count" />. </para>
+    </section>
+    <section
+      xml:id="perf.hfile.block.cache.size">
+      <title><varname>hfile.block.cache.size</varname></title>
+      <para>See <xref
+          linkend="hfile.block.cache.size" />. A memory setting for the RegionServer process.
+      </para>
+    </section>
+    <section
+      xml:id="perf.rs.memstore.size">
+      <title><varname>hbase.regionserver.global.memstore.size</varname></title>
+      <para>See <xref
+          linkend="hbase.regionserver.global.memstore.size" />. This memory setting is often
+        adjusted for the RegionServer process depending on needs. </para>
+    </section>
+    <section
+      xml:id="perf.rs.memstore.size.lower.limit">
+      <title><varname>hbase.regionserver.global.memstore.size.lower.limit</varname></title>
+      <para>See <xref
+          linkend="hbase.regionserver.global.memstore.size.lower.limit" />. This memory setting is
+        often adjusted for the RegionServer process depending on needs. </para>
+    </section>
+    <section
+      xml:id="perf.hstore.blockingstorefiles">
+      <title><varname>hbase.hstore.blockingStoreFiles</varname></title>
+      <para>See <xref
+          linkend="hbase.hstore.blockingStoreFiles" />. If there is blocking in the RegionServer
+        logs, increasing this can help. </para>
+    </section>
+    <section
+      xml:id="perf.hregion.memstore.block.multiplier">
+      <title><varname>hbase.hregion.memstore.block.multiplier</varname></title>
+      <para>See <xref
+          linkend="hbase.hregion.memstore.block.multiplier" />. If there is enough RAM, increasing
+        this can help. </para>
+    </section>
+    <section
+      xml:id="hbase.regionserver.checksum.verify">
+      <title><varname>hbase.regionserver.checksum.verify</varname></title>
+      <para>Have HBase write the checksum into the datablock and save having to do the checksum seek
+        whenever you read.</para>
+
+      <para>See <xref
+          linkend="hbase.regionserver.checksum.verify" />, <xref
+          linkend="hbase.hstore.bytes.per.checksum" /> and <xref
+          linkend="hbase.hstore.checksum.algorithm" /> For more information see the release note on <link
+          xlink:href="https://issues.apache.org/jira/browse/HBASE-5074">HBASE-5074 support checksums
+          in HBase block cache</link>. </para>
     </section>
 
   </section>
@@ -215,293 +263,335 @@
 
 
 
-  <section xml:id="perf.zookeeper">
+  <section
+    xml:id="perf.zookeeper">
     <title>ZooKeeper</title>
-    <para>See <xref linkend="zookeeper"/> for information on configuring ZooKeeper, and see the part
-    about having a dedicated disk.
-    </para>
+    <para>See <xref
+        linkend="zookeeper" /> for information on configuring ZooKeeper, and see the part about
+      having a dedicated disk. </para>
   </section>
-  <section xml:id="perf.schema">
-      <title>Schema Design</title>
+  <section
+    xml:id="perf.schema">
+    <title>Schema Design</title>
 
-    <section xml:id="perf.number.of.cfs">
+    <section
+      xml:id="perf.number.of.cfs">
       <title>Number of Column Families</title>
-      <para>See <xref linkend="number.of.cfs" />.</para>
+      <para>See <xref
+          linkend="number.of.cfs" />.</para>
     </section>
-    <section xml:id="perf.schema.keys">
+    <section
+      xml:id="perf.schema.keys">
       <title>Key and Attribute Lengths</title>
-      <para>See <xref linkend="keysize" />.  See also <xref linkend="perf.compression.however" /> for
-      compression caveats.</para>
-    </section>
-    <section xml:id="schema.regionsize"><title>Table RegionSize</title>
-    <para>The regionsize can be set on a per-table basis via <code>setFileSize</code> on
-    <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HTableDescriptor.html">HTableDescriptor</link> in the
-    event where certain tables require different regionsizes than the configured default regionsize.
-    </para>
-    <para>See <xref linkend="ops.capacity.regions"/> for more information.
-    </para>
-    </section>
-    <section xml:id="schema.bloom">
-    <title>Bloom Filters</title>
-    <para>Bloom Filters can be enabled per-ColumnFamily.
-        Use <code>HColumnDescriptor.setBloomFilterType(NONE | ROW |
-        ROWCOL)</code> to enable blooms per Column Family. Default =
-        <varname>NONE</varname> for no bloom filters. If
-        <varname>ROW</varname>, the hash of the row will be added to the bloom
-        on each insert. If <varname>ROWCOL</varname>, the hash of the row +
-        column family name + column family qualifier will be added to the bloom on
-        each key insert.</para>
-    <para>See <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html">HColumnDescriptor</link> and
-    <xref linkend="blooms"/> for more information or this answer up in quora,
-<link xlink:href="http://www.quora.com/How-are-bloom-filters-used-in-HBase">How are bloom filters used in HBase?</link>.
-    </para>
-    </section>
-    <section xml:id="schema.cf.blocksize"><title>ColumnFamily BlockSize</title>
-    <para>The blocksize can be configured for each ColumnFamily in a table, and this defaults to 64k.  Larger cell values require larger blocksizes.
-    There is an inverse relationship between blocksize and the resulting StoreFile indexes (i.e., if the blocksize is doubled then the resulting
-    indexes should be roughly halved).
-    </para>
-    <para>See <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html">HColumnDescriptor</link>
-    and <xref linkend="store"/>for more information.
-    </para>
-    </section>
-    <section xml:id="cf.in.memory">
-    <title>In-Memory ColumnFamilies</title>
-    <para>ColumnFamilies can optionally be defined as in-memory.  Data is still persisted to disk, just like any other ColumnFamily.
-    In-memory blocks have the highest priority in the <xref linkend="block.cache" />, but it is not a guarantee that the entire table
-    will be in memory.
-    </para>
-    <para>See <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html">HColumnDescriptor</link> for more information.
-    </para>
-    </section>
-    <section xml:id="perf.compression">
+      <para>See <xref
+          linkend="keysize" />. See also <xref
+          linkend="perf.compression.however" /> for compression caveats.</para>
+    </section>
+    <section
+      xml:id="schema.regionsize">
+      <title>Table RegionSize</title>
+      <para>The regionsize can be set on a per-table basis via <code>setFileSize</code> on <link
+          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HTableDescriptor.html">HTableDescriptor</link>
+        in the event where certain tables require different regionsizes than the configured default
+        regionsize. </para>
+      <para>See <xref
+          linkend="ops.capacity.regions" /> for more information. </para>
+    </section>
+    <section
+      xml:id="schema.bloom">
+      <title>Bloom Filters</title>
+      <para>Bloom Filters can be enabled per-ColumnFamily. Use
+          <code>HColumnDescriptor.setBloomFilterType(NONE | ROW | ROWCOL)</code> to enable blooms
+        per Column Family. Default = <varname>NONE</varname> for no bloom filters. If
+          <varname>ROW</varname>, the hash of the row will be added to the bloom on each insert. If
+          <varname>ROWCOL</varname>, the hash of the row + column family name + column family
+        qualifier will be added to the bloom on each key insert.</para>
+      <para>See <link
+          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html">HColumnDescriptor</link>
+        and <xref
+          linkend="blooms" /> for more information or this answer up in quora, <link
+          xlink:href="http://www.quora.com/How-are-bloom-filters-used-in-HBase">How are bloom
+          filters used in HBase?</link>. </para>
+    </section>
+    <section
+      xml:id="schema.cf.blocksize">
+      <title>ColumnFamily BlockSize</title>
+      <para>The blocksize can be configured for each ColumnFamily in a table, and this defaults to
+        64k. Larger cell values require larger blocksizes. There is an inverse relationship between
+        blocksize and the resulting StoreFile indexes (i.e., if the blocksize is doubled then the
+        resulting indexes should be roughly halved). </para>
+      <para>See <link
+          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html">HColumnDescriptor</link>
+        and <xref
+          linkend="store" />for more information. </para>
+    </section>
+    <section
+      xml:id="cf.in.memory">
+      <title>In-Memory ColumnFamilies</title>
+      <para>ColumnFamilies can optionally be defined as in-memory. Data is still persisted to disk,
+        just like any other ColumnFamily. In-memory blocks have the highest priority in the <xref
+          linkend="block.cache" />, but it is not a guarantee that the entire table will be in
+        memory. </para>
+      <para>See <link
+          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html">HColumnDescriptor</link>
+        for more information. </para>
+    </section>
+    <section
+      xml:id="perf.compression">
       <title>Compression</title>
-      <para>Production systems should use compression with their ColumnFamily definitions.  See <xref linkend="compression" /> for more information.
-      </para>
-      <section xml:id="perf.compression.however"><title>However...</title>
-         <para>Compression deflates data <emphasis>on disk</emphasis>.  When it's in-memory (e.g., in the
-         MemStore) or on the wire (e.g., transferring between RegionServer and Client) it's inflated.
-         So while using ColumnFamily compression is a best practice, but it's not going to completely eliminate
-         the impact of over-sized Keys, over-sized ColumnFamily names, or over-sized Column names.
-         </para>
-         <para>See <xref linkend="keysize" /> on for schema design tips, and <xref linkend="keyvalue"/> for more information on HBase stores data internally.
-         </para>
+      <para>Production systems should use compression with their ColumnFamily definitions. See <xref
+          linkend="compression" /> for more information. </para>
+      <section
+        xml:id="perf.compression.however">
+        <title>However...</title>
+        <para>Compression deflates data <emphasis>on disk</emphasis>. When it's in-memory (e.g., in
+          the MemStore) or on the wire (e.g., transferring between RegionServer and Client) it's
+          inflated. So while using ColumnFamily compression is a best practice, but it's not going
+          to completely eliminate the impact of over-sized Keys, over-sized ColumnFamily names, or
+          over-sized Column names. </para>
+        <para>See <xref
+            linkend="keysize" /> on for schema design tips, and <xref
+            linkend="keyvalue" /> for more information on HBase stores data internally. </para>
       </section>
     </section>
-  </section>  <!--  perf schema -->
+  </section>
+  <!--  perf schema -->
 
-  <section xml:id="perf.general">
+  <section
+    xml:id="perf.general">
     <title>HBase General Patterns</title>
-    <section xml:id="perf.general.constants">
+    <section
+      xml:id="perf.general.constants">
       <title>Constants</title>
-      <para>When people get started with HBase they have a tendency to write code that looks like this:
-<programlisting>
+      <para>When people get started with HBase they have a tendency to write code that looks like
+        this:</para>
+      <programlisting>
 Get get = new Get(rowkey);
 Result r = htable.get(get);
 byte[] b = r.getValue(Bytes.toBytes("cf"), Bytes.toBytes("attr"));  // returns current version of value
-</programlisting>
-		But especially when inside loops (and MapReduce jobs), converting the columnFamily and column-names
-		to byte-arrays repeatedly is surprisingly expensive.
-		It's better to use constants for the byte-arrays, like this:
-<programlisting>
+      </programlisting>
+      <para>But especially when inside loops (and MapReduce jobs), converting the columnFamily and
+        column-names to byte-arrays repeatedly is surprisingly expensive. It's better to use
+        constants for the byte-arrays, like this:</para>
+      <programlisting>
 public static final byte[] CF = "cf".getBytes();
 public static final byte[] ATTR = "attr".getBytes();
 ...
 Get get = new Get(rowkey);
 Result r = htable.get(get);
 byte[] b = r.getValue(CF, ATTR);  // returns current version of value
-</programlisting>
-      </para>
+      </programlisting>
     </section>
 
   </section>
-  <section xml:id="perf.writing">
+  <section
+    xml:id="perf.writing">
     <title>Writing to HBase</title>
 
-    <section xml:id="perf.batch.loading">
+    <section
+      xml:id="perf.batch.loading">
       <title>Batch Loading</title>
-      <para>Use the bulk load tool if you can.  See
-        <xref linkend="arch.bulk.load"/>.
-        Otherwise, pay attention to the below.
-      </para>
-    </section>  <!-- batch loading -->
-
-    <section xml:id="precreate.regions">
-    <title>
-    Table Creation: Pre-Creating Regions
-    </title>
-<para>
-Tables in HBase are initially created with one region by default.  For bulk imports, this means that all clients will write to the same region
-until it is large enough to split and become distributed across the cluster.  A useful pattern to speed up the bulk import process is to pre-create empty regions.
- Be somewhat conservative in this, because too-many regions can actually degrade performance.
-</para>
-	<para>There are two different approaches to pre-creating splits.  The first approach is to rely on the default <code>HBaseAdmin</code> strategy
-	(which is implemented in <code>Bytes.split</code>)...
-	</para>
-<programlisting>
+      <para>Use the bulk load tool if you can. See <xref
+          linkend="arch.bulk.load" />. Otherwise, pay attention to the below. </para>
+    </section>
+    <!-- batch loading -->
+
+    <section
+      xml:id="precreate.regions">
+      <title> Table Creation: Pre-Creating Regions </title>
+      <para> Tables in HBase are initially created with one region by default. For bulk imports,
+        this means that all clients will write to the same region until it is large enough to split
+        and become distributed across the cluster. A useful pattern to speed up the bulk import
+        process is to pre-create empty regions. Be somewhat conservative in this, because too-many
+        regions can actually degrade performance. </para>
+      <para>There are two different approaches to pre-creating splits. The first approach is to rely
+        on the default <code>HBaseAdmin</code> strategy (which is implemented in
+          <code>Bytes.split</code>)... </para>
+      <programlisting>
 byte[] startKey = ...;   	// your lowest keuy
 byte[] endKey = ...;   		// your highest key
 int numberOfRegions = ...;	// # of regions to create
 admin.createTable(table, startKey, endKey, numberOfRegions);
-</programlisting>
-	<para>And the other approach is to define the splits yourself...
-	</para>
-<programlisting>
+      </programlisting>
+      <para>And the other approach is to define the splits yourself... </para>
+      <programlisting>
 byte[][] splits = ...;   // create your own splits
 admin.createTable(table, splits);
 </programlisting>
-<para>
-   See <xref linkend="rowkey.regionsplits"/> for issues related to understanding your keyspace and pre-creating regions.
-  </para>
-  </section>
-    <section xml:id="def.log.flush">
-    <title>
-    Table Creation: Deferred Log Flush
-    </title>
-<para>
-The default behavior for Puts using the Write Ahead Log (WAL) is that <classname>HLog</classname> edits will be written immediately.  If deferred log flush is used,
-WAL edits are kept in memory until the flush period.  The benefit is aggregated and asynchronous <classname>HLog</classname>- writes, but the potential downside is that if
- the RegionServer goes down the yet-to-be-flushed edits are lost.  This is safer, however, than not using WAL at all with Puts.
-</para>
-<para>
-Deferred log flush can be configured on tables via <link
-      xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HTableDescriptor.html">HTableDescriptor</link>.  The default value of <varname>hbase.regionserver.optionallogflushinterval</varname> is 1000ms.
-</para>
-    </section>
-
-    <section xml:id="perf.hbase.client.autoflush">
-      <title>HBase Client:  AutoFlush</title>
-
-      <para>When performing a lot of Puts, make sure that setAutoFlush is set
-      to false on your <link
-      xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html">HTable</link>
-      instance. Otherwise, the Puts will be sent one at a time to the
-      RegionServer. Puts added via <code> htable.add(Put)</code> and <code> htable.add( &lt;List&gt; Put)</code>
-      wind up in the same write buffer. If <code>autoFlush = false</code>,
-      these messages are not sent until the write-buffer is filled. To
-      explicitly flush the messages, call <methodname>flushCommits</methodname>.
-      Calling <methodname>close</methodname> on the <classname>HTable</classname>
-      instance will invoke <methodname>flushCommits</methodname>.</para>
-    </section>
-    <section xml:id="perf.hbase.client.putwal">
-      <title>HBase Client:  Turn off WAL on Puts</title>
-      <para>A frequently discussed option for increasing throughput on <classname>Put</classname>s is to call <code>writeToWAL(false)</code>.  Turning this off means
-          that the RegionServer will <emphasis>not</emphasis> write the <classname>Put</classname> to the Write Ahead Log,
-          only into the memstore, HOWEVER the consequence is that if there
-          is a RegionServer failure <emphasis>there will be data loss</emphasis>.
-          If <code>writeToWAL(false)</code> is used, do so with extreme caution.  You may find in actuality that
-          it makes little difference if your load is well distributed across the cluster.
-      </para>
-      <para>In general, it is best to use WAL for Puts, and where loading throughput
-          is a concern to use <link linkend="perf.batch.loading">bulk loading</link> techniques instead.
-      </para>
-    </section>
-    <section xml:id="perf.hbase.client.regiongroup">
+      <para> See <xref
+          linkend="rowkey.regionsplits" /> for issues related to understanding your keyspace and
+        pre-creating regions. </para>
+    </section>
+    <section
+      xml:id="def.log.flush">
+      <title> Table Creation: Deferred Log Flush </title>
+      <para> The default behavior for Puts using the Write Ahead Log (WAL) is that
+          <classname>HLog</classname> edits will be written immediately. If deferred log flush is
+        used, WAL edits are kept in memory until the flush period. The benefit is aggregated and
+        asynchronous <classname>HLog</classname>- writes, but the potential downside is that if the
+        RegionServer goes down the yet-to-be-flushed edits are lost. This is safer, however, than
+        not using WAL at all with Puts. </para>
+      <para> Deferred log flush can be configured on tables via <link
+          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HTableDescriptor.html">HTableDescriptor</link>.
+        The default value of <varname>hbase.regionserver.optionallogflushinterval</varname> is
+        1000ms. </para>
+    </section>
+
+    <section
+      xml:id="perf.hbase.client.autoflush">
+      <title>HBase Client: AutoFlush</title>
+
+      <para>When performing a lot of Puts, make sure that setAutoFlush is set to false on your <link
+          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html">HTable</link>
+        instance. Otherwise, the Puts will be sent one at a time to the RegionServer. Puts added via
+          <code> htable.add(Put)</code> and <code> htable.add( &lt;List&gt; Put)</code> wind up in
+        the same write buffer. If <code>autoFlush = false</code>, these messages are not sent until
+        the write-buffer is filled. To explicitly flush the messages, call
+          <methodname>flushCommits</methodname>. Calling <methodname>close</methodname> on the
+          <classname>HTable</classname> instance will invoke
+        <methodname>flushCommits</methodname>.</para>
+    </section>
+    <section
+      xml:id="perf.hbase.client.putwal">
+      <title>HBase Client: Turn off WAL on Puts</title>
+      <para>A frequently discussed option for increasing throughput on <classname>Put</classname>s
+        is to call <code>writeToWAL(false)</code>. Turning this off means that the RegionServer will
+          <emphasis>not</emphasis> write the <classname>Put</classname> to the Write Ahead Log, only
+        into the memstore, HOWEVER the consequence is that if there is a RegionServer failure
+          <emphasis>there will be data loss</emphasis>. If <code>writeToWAL(false)</code> is used,
+        do so with extreme caution. You may find in actuality that it makes little difference if
+        your load is well distributed across the cluster. </para>
+      <para>In general, it is best to use WAL for Puts, and where loading throughput is a concern to
+        use <link
+          linkend="perf.batch.loading">bulk loading</link> techniques instead. </para>
+    </section>
+    <section
+      xml:id="perf.hbase.client.regiongroup">
       <title>HBase Client: Group Puts by RegionServer</title>
-      <para>In addition to using the writeBuffer, grouping <classname>Put</classname>s by RegionServer can reduce the number of client RPC calls per writeBuffer flush.
-      There is a utility <classname>HTableUtil</classname> currently on TRUNK that does this, but you can either copy that or implement your own version for
-      those still on 0.90.x or earlier.
-      </para>
-    </section>
-    <section xml:id="perf.hbase.write.mr.reducer">
-      <title>MapReduce:  Skip The Reducer</title>
+      <para>In addition to using the writeBuffer, grouping <classname>Put</classname>s by
+        RegionServer can reduce the number of client RPC calls per writeBuffer flush. There is a
+        utility <classname>HTableUtil</classname> currently on TRUNK that does this, but you can
+        either copy that or implement your own version for those still on 0.90.x or earlier. </para>
+    </section>
+    <section
+      xml:id="perf.hbase.write.mr.reducer">
+      <title>MapReduce: Skip The Reducer</title>
       <para>When writing a lot of data to an HBase table from a MR job (e.g., with <link
-      xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.html">TableOutputFormat</link>), and specifically where Puts are being emitted
-      from the Mapper, skip the Reducer step.  When a Reducer step is used, all of the output (Puts) from the Mapper will get spooled to disk, then sorted/shuffled to other
-      Reducers that will most likely be off-node.  It's far more efficient to just write directly to HBase.
-      </para>
-      <para>For summary jobs where HBase is used as a source and a sink, then writes will be coming from the Reducer step (e.g., summarize values then write out result).
-      This is a different processing problem than from the the above case.
+          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.html">TableOutputFormat</link>),
+        and specifically where Puts are being emitted from the Mapper, skip the Reducer step. When a
+        Reducer step is used, all of the output (Puts) from the Mapper will get spooled to disk,
+        then sorted/shuffled to other Reducers that will most likely be off-node. It's far more
+        efficient to just write directly to HBase. </para>
+      <para>For summary jobs where HBase is used as a source and a sink, then writes will be coming
+        from the Reducer step (e.g., summarize values then write out result). This is a different
+        processing problem than from the the above case. </para>
+    </section>
+
+    <section
+      xml:id="perf.one.region">
+      <title>Anti-Pattern: One Hot Region</title>
+      <para>If all your data is being written to one region at a time, then re-read the section on
+        processing <link
+          linkend="timeseries">timeseries</link> data.</para>
+      <para>Also, if you are pre-splitting regions and all your data is <emphasis>still</emphasis>
+        winding up in a single region even though your keys aren't monotonically increasing, confirm
+        that your keyspace actually works with the split strategy. There are a variety of reasons
+        that regions may appear "well split" but won't work with your data. As the HBase client
+        communicates directly with the RegionServers, this can be obtained via <link
+          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#getRegionLocation%28byte[]%29">HTable.getRegionLocation</link>. </para>
+      <para>See <xref
+          linkend="precreate.regions" />, as well as <xref
+          linkend="perf.configurations" />
       </para>
     </section>
 
-  <section xml:id="perf.one.region">
-    <title>Anti-Pattern:  One Hot Region</title>
-    <para>If all your data is being written to one region at a time, then re-read the
-    section on processing <link linkend="timeseries">timeseries</link> data.</para>
-    <para>Also, if you are pre-splitting regions and all your data is <emphasis>still</emphasis> winding up in a single region even though
-    your keys aren't monotonically increasing, confirm that your keyspace actually works with the split strategy.  There are a
-    variety of reasons that regions may appear "well split" but won't work with your data.   As
-    the HBase client communicates directly with the RegionServers, this can be obtained via
-    <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#getRegionLocation%28byte[]%29">HTable.getRegionLocation</link>.
-    </para>
-    <para>See <xref linkend="precreate.regions"/>, as well as <xref linkend="perf.configurations"/> </para>
   </section>
+  <!--  writing -->
 
-  </section>  <!--  writing -->
-
-  <section xml:id="perf.reading">
+  <section
+    xml:id="perf.reading">
     <title>Reading from HBase</title>
-    <para>The mailing list can help if you are having performance issues.
-    For example, here is a good general thread on what to look at addressing
-    read-time issues: <link xlink:href="http://search-hadoop.com/m/qOo2yyHtCC1">HBase Random Read latency > 100ms</link></para>
-    <section xml:id="perf.hbase.client.caching">
+    <para>The mailing list can help if you are having performance issues. For example, here is a
+      good general thread on what to look at addressing read-time issues: <link
+        xlink:href="http://search-hadoop.com/m/qOo2yyHtCC1">HBase Random Read latency >
+      100ms</link></para>
+    <section
+      xml:id="perf.hbase.client.caching">
       <title>Scan Caching</title>
 
-      <para>If HBase is used as an input source for a MapReduce job, for
-      example, make sure that the input <link
-      xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scan</link>
-      instance to the MapReduce job has <methodname>setCaching</methodname> set to something greater
-      than the default (which is 1). Using the default value means that the
-      map-task will make call back to the region-server for every record
-      processed. Setting this value to 500, for example, will transfer 500
-      rows at a time to the client to be processed. There is a cost/benefit to
-      have the cache value be large because it costs more in memory for both
-      client and RegionServer, so bigger isn't always better.</para>
-      <section xml:id="perf.hbase.client.caching.mr">
+      <para>If HBase is used as an input source for a MapReduce job, for example, make sure that the
+        input <link
+          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scan</link>
+        instance to the MapReduce job has <methodname>setCaching</methodname> set to something
+        greater than the default (which is 1). Using the default value means that the map-task will
+        make call back to the region-server for every record processed. Setting this value to 500,
+        for example, will transfer 500 rows at a time to the client to be processed. There is a
+        cost/benefit to have the cache value be large because it costs more in memory for both
+        client and RegionServer, so bigger isn't always better.</para>
+      <section
+        xml:id="perf.hbase.client.caching.mr">
         <title>Scan Caching in MapReduce Jobs</title>
-        <para>Scan settings in MapReduce jobs deserve special attention.  Timeouts can result (e.g., UnknownScannerException)
-        in Map tasks if it takes longer to process a batch of records before the client goes back to the RegionServer for the
-        next set of data.  This problem can occur because there is non-trivial processing occuring per row.  If you process
-        rows quickly, set caching higher.  If you process rows more slowly (e.g., lots of transformations per row, writes),
-        then set caching lower.
-        </para>
-        <para>Timeouts can also happen in a non-MapReduce use case (i.e., single threaded HBase client doing a Scan), but the
-        processing that is often performed in MapReduce jobs tends to exacerbate this issue.
-        </para>
+        <para>Scan settings in MapReduce jobs deserve special attention. Timeouts can result (e.g.,
+          UnknownScannerException) in Map tasks if it takes longer to process a batch of records
+          before the client goes back to the RegionServer for the next set of data. This problem can
+          occur because there is non-trivial processing occuring per row. If you process rows
+          quickly, set caching higher. If you process rows more slowly (e.g., lots of
+          transformations per row, writes), then set caching lower. </para>
+        <para>Timeouts can also happen in a non-MapReduce use case (i.e., single threaded HBase
+          client doing a Scan), but the processing that is often performed in MapReduce jobs tends
+          to exacerbate this issue. </para>
       </section>
     </section>
-    <section xml:id="perf.hbase.client.selection">
+    <section
+      xml:id="perf.hbase.client.selection">
       <title>Scan Attribute Selection</title>
 
-      <para>Whenever a Scan is used to process large numbers of rows (and especially when used
-      as a MapReduce source), be aware of which attributes are selected.   If <code>scan.addFamily</code> is called
-      then <emphasis>all</emphasis> of the attributes in the specified ColumnFamily will be returned to the client.
-      If only a small number of the available attributes are to be processed, then only those attributes should be specified
-      in the input scan because attribute over-selection is a non-trivial performance penalty over large datasets.
-      </para>
+      <para>Whenever a Scan is used to process large numbers of rows (and especially when used as a
+        MapReduce source), be aware of which attributes are selected. If <code>scan.addFamily</code>
+        is called then <emphasis>all</emphasis> of the attributes in the specified ColumnFamily will
+        be returned to the client. If only a small number of the available attributes are to be
+        processed, then only those attributes should be specified in the input scan because
+        attribute over-selection is a non-trivial performance penalty over large datasets. </para>
     </section>
-    <section xml:id="perf.hbase.client.seek">
+    <section
+      xml:id="perf.hbase.client.seek">
       <title>Avoid scan seeks</title>
-      <para>When columns are selected explicitly with <code>scan.addColumn</code>, HBase will schedule seek operations to seek between the
-      selected columns. When rows have few columns and each column has only a few versions this can be inefficient. A seek operation is generally
-      slower if does not seek at least past 5-10 columns/versions or 512-1024 bytes.</para>
-      <para>In order to opportunistically look ahead a few columns/versions to see if the next column/version can be found that
-      way before a seek operation is scheduled, a new attribute <code>Scan.HINT_LOOKAHEAD</code> can be set the on Scan object. The following code instructs the
-      RegionServer to attempt two iterations of next before a seek is scheduled:<programlisting>
+      <para>When columns are selected explicitly with <code>scan.addColumn</code>, HBase will
+        schedule seek operations to seek between the selected columns. When rows have few columns
+        and each column has only a few versions this can be inefficient. A seek operation is
+        generally slower if does not seek at least past 5-10 columns/versions or 512-1024
+        bytes.</para>
+      <para>In order to opportunistically look ahead a few columns/versions to see if the next
+        column/version can be found that way before a seek operation is scheduled, a new attribute
+          <code>Scan.HINT_LOOKAHEAD</code> can be set the on Scan object. The following code
+        instructs the RegionServer to attempt two iterations of next before a seek is
+        scheduled:</para>
+      <programlisting>
 Scan scan = new Scan();
 scan.addColumn(...);
 scan.setAttribute(Scan.HINT_LOOKAHEAD, Bytes.toBytes(2));
 table.getScanner(scan);
-</programlisting></para>
-	</section>
-    <section xml:id="perf.hbase.mr.input">
-        <title>MapReduce - Input Splits</title>
-        <para>For MapReduce jobs that use HBase tables as a source, if there a pattern where the "slow" map tasks seem to
-        have the same Input Split (i.e., the RegionServer serving the data), see the
-        Troubleshooting Case Study in <xref linkend="casestudies.slownode"/>.
-        </para>
+      </programlisting>
+    </section>
+    <section
+      xml:id="perf.hbase.mr.input">
+      <title>MapReduce - Input Splits</title>
+      <para>For MapReduce jobs that use HBase tables as a source, if there a pattern where the
+        "slow" map tasks seem to have the same Input Split (i.e., the RegionServer serving the
+        data), see the Troubleshooting Case Study in <xref
+          linkend="casestudies.slownode" />. </para>
     </section>
 
-    <section xml:id="perf.hbase.client.scannerclose">
+    <section
+      xml:id="perf.hbase.client.scannerclose">
       <title>Close ResultScanners</title>
 
-      <para>This isn't so much about improving performance but rather
-      <emphasis>avoiding</emphasis> performance problems. If you forget to
-      close <link
-      xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/ResultScanner.html">ResultScanners</link>
-      you can cause problems on the RegionServers. Always have ResultScanner
-      processing enclosed in try/catch blocks... <programlisting>
+      <para>This isn't so much about improving performance but rather <emphasis>avoiding</emphasis>
+        performance problems. If you forget to close <link
+          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/ResultScanner.html">ResultScanners</link>
+        you can cause problems on the RegionServers. Always have ResultScanner processing enclosed
+        in try/catch blocks...</para>
+      <programlisting>
 Scan scan = new Scan();
 // set attrs...
 ResultScanner rs = htable.getScanner(scan);
@@ -511,184 +601,201 @@ try {
 } finally {
   rs.close();  // always close the ResultScanner!
 }
-htable.close();</programlisting></para>
+htable.close();
+      </programlisting>
     </section>
 
-    <section xml:id="perf.hbase.client.blockcache">
+    <section
+      xml:id="perf.hbase.client.blockcache">
       <title>Block Cache</title>
 
       <para><link
-      xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scan</link>
-      instances can be set to use the block cache in the RegionServer via the
-      <methodname>setCacheBlocks</methodname> method. For input Scans to MapReduce jobs, this should be
-      <varname>false</varname>. For frequently accessed rows, it is advisable to use the block
-      cache.</para>
-    </section>
-    <section xml:id="perf.hbase.client.rowkeyonly">
+          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scan</link>
+        instances can be set to use the block cache in the RegionServer via the
+          <methodname>setCacheBlocks</methodname> method. For input Scans to MapReduce jobs, this
+        should be <varname>false</varname>. For frequently accessed rows, it is advisable to use the
+        block cache.</para>
+    </section>
+    <section
+      xml:id="perf.hbase.client.rowkeyonly">
       <title>Optimal Loading of Row Keys</title>
-      <para>When performing a table <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">scan</link>
-            where only the row keys are needed (no families, qualifiers, values or timestamps), add a FilterList with a
-            <varname>MUST_PASS_ALL</varname> operator to the scanner using <methodname>setFilter</methodname>. The filter list
-            should include both a <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.html">FirstKeyOnlyFilter</link>
-            and a <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/KeyOnlyFilter.html">KeyOnlyFilter</link>.
-            Using this filter combination will result in a worst case scenario of a RegionServer reading a single value from disk
-            and minimal network traffic to the client for a single row.
+      <para>When performing a table <link
+          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">scan</link>
+        where only the row keys are needed (no families, qualifiers, values or timestamps), add a
+        FilterList with a <varname>MUST_PASS_ALL</varname> operator to the scanner using
+          <methodname>setFilter</methodname>. The filter list should include both a <link
+          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.html">FirstKeyOnlyFilter</link>
+        and a <link
+          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/KeyOnlyFilter.html">KeyOnlyFilter</link>.
+        Using this filter combination will result in a worst case scenario of a RegionServer reading
+        a single value from disk and minimal network traffic to the client for a single row. </para>
+    </section>
+    <section
+      xml:id="perf.hbase.read.dist">
+      <title>Concurrency: Monitor Data Spread</title>
+      <para>When performing a high number of concurrent reads, monitor the data spread of the target
+        tables. If the target table(s) have too few regions then the reads could likely be served
+        from too few nodes. </para>
+      <para>See <xref
+          linkend="precreate.regions" />, as well as <xref
+          linkend="perf.configurations" />
       </para>
     </section>
-   <section xml:id="perf.hbase.read.dist">
-      <title>Concurrency:  Monitor Data Spread</title>
-      <para>When performing a high number of concurrent reads, monitor the data spread of the target tables.  If the target table(s) have
-      too few regions then the reads could likely be served from too few nodes.  </para>
-      <para>See <xref linkend="precreate.regions"/>, as well as <xref linkend="perf.configurations"/> </para>
-   </section>
-     <section xml:id="blooms">
-     <title>Bloom Filters</title>
-         <para>Enabling Bloom Filters can save your having to go to disk and
-         can help improve read latencies.</para>
-         <para><link xlink:href="http://en.wikipedia.org/wiki/Bloom_filter">Bloom filters</link> were developed over in <link
-    xlink:href="https://issues.apache.org/jira/browse/HBASE-1200">HBase-1200
-    Add bloomfilters</link>.<footnote>
-        <para>For description of the development process -- why static blooms
-        rather than dynamic -- and for an overview of the unique properties
-        that pertain to blooms in HBase, as well as possible future
-        directions, see the <emphasis>Development Process</emphasis> section
-        of the document <link
-        xlink:href="https://issues.apache.org/jira/secure/attachment/12444007/Bloom_Filters_in_HBase.pdf">BloomFilters
-        in HBase</link> attached to <link
-        xlink:href="https://issues.apache.org/jira/browse/HBASE-1200">HBase-1200</link>.</para>
-      </footnote><footnote>
-        <para>The bloom filters described here are actually version two of
-        blooms in HBase. In versions up to 0.19.x, HBase had a dynamic bloom
-        option based on work done by the <link
-        xlink:href="http://www.one-lab.org">European Commission One-Lab
-        Project 034819</link>. The core of the HBase bloom work was later
-        pulled up into Hadoop to implement org.apache.hadoop.io.BloomMapFile.
-        Version 1 of HBase blooms never worked that well. Version 2 is a
-        rewrite from scratch though again it starts with the one-lab
-        work.</para>
-      </footnote></para>
-        <para>See also <xref linkend="schema.bloom" />.
-        </para>
-
-     <section xml:id="bloom_footprint">
-      <title>Bloom StoreFile footprint</title>
-
-      <para>Bloom filters add an entry to the <classname>StoreFile</classname>
-      general <classname>FileInfo</classname> data structure and then two
-      extra entries to the <classname>StoreFile</classname> metadata
-      section.</para>
-
-      <section>
-        <title>BloomFilter in the <classname>StoreFile</classname>
-        <classname>FileInfo</classname> data structure</title>
-
-          <para><classname>FileInfo</classname> has a
-          <varname>BLOOM_FILTER_TYPE</varname> entry which is set to
-          <varname>NONE</varname>, <varname>ROW</varname> or
-          <varname>ROWCOL.</varname></para>
-      </section>
+    <section
+      xml:id="blooms">
+      <title>Bloom Filters</title>
+      <para>Enabling Bloom Filters can save your having to go to disk and can help improve read
+        latencies.</para>
+      <para><link
+          xlink:href="http://en.wikipedia.org/wiki/Bloom_filter">Bloom filters</link> were developed
+        over in <link
+          xlink:href="https://issues.apache.org/jira/browse/HBASE-1200">HBase-1200 Add
+          bloomfilters</link>.<footnote>
+          <para>For description of the development process -- why static blooms rather than dynamic
+            -- and for an overview of the unique properties that pertain to blooms in HBase, as well
+            as possible future directions, see the <emphasis>Development Process</emphasis> section
+            of the document <link
+              xlink:href="https://issues.apache.org/jira/secure/attachment/12444007/Bloom_Filters_in_HBase.pdf">BloomFilters
+              in HBase</link> attached to <link
+              xlink:href="https://issues.apache.org/jira/browse/HBASE-1200">HBase-1200</link>.</para>
+        </footnote><footnote>
+          <para>The bloom filters described here are actually version two of blooms in HBase. In
+            versions up to 0.19.x, HBase had a dynamic bloom option based on work done by the <link
+              xlink:href="http://www.one-lab.org">European Commission One-Lab Project 034819</link>.
+            The core of the HBase bloom work was later pulled up into Hadoop to implement
+            org.apache.hadoop.io.BloomMapFile. Version 1 of HBase blooms never worked that well.
+            Version 2 is a rewrite from scratch though again it starts with the one-lab work.</para>
+        </footnote></para>
+      <para>See also <xref
+          linkend="schema.bloom" />. </para>
+
+      <section
+        xml:id="bloom_footprint">
+        <title>Bloom StoreFile footprint</title>
+
+        <para>Bloom filters add an entry to the <classname>StoreFile</classname> general
+            <classname>FileInfo</classname> data structure and then two extra entries to the
+            <classname>StoreFile</classname> metadata section.</para>
+
+        <section>
+          <title>BloomFilter in the <classname>StoreFile</classname>
+            <classname>FileInfo</classname> data structure</title>
 
-      <section>
-        <title>BloomFilter entries in <classname>StoreFile</classname>
-        metadata</title>
+          <para><classname>FileInfo</classname> has a <varname>BLOOM_FILTER_TYPE</varname> entry
+            which is set to <varname>NONE</varname>, <varname>ROW</varname> or
+              <varname>ROWCOL.</varname></para>
+        </section>
+
+        <section>
+          <title>BloomFilter entries in <classname>StoreFile</classname> metadata</title>
 
-          <para><varname>BLOOM_FILTER_META</varname> holds Bloom Size, Hash
-          Function used, etc. Its small in size and is cached on
-          <classname>StoreFile.Reader</classname> load</para>
-          <para><varname>BLOOM_FILTER_DATA</varname> is the actual bloomfilter
-          data. Obtained on-demand. Stored in the LRU cache, if it is enabled
-          (Its enabled by default).</para>
+          <para><varname>BLOOM_FILTER_META</varname> holds Bloom Size, Hash Function used, etc. Its
+            small in size and is cached on <classname>StoreFile.Reader</classname> load</para>
+          <para><varname>BLOOM_FILTER_DATA</varname> is the actual bloomfilter data. Obtained
+            on-demand. Stored in the LRU cache, if it is enabled (Its enabled by default).</para>
+        </section>
       </section>
-     </section>
-	  <section xml:id="config.bloom">
-	    <title>Bloom Filter Configuration</title>
+      <section
+        xml:id="config.bloom">
+        <title>Bloom Filter Configuration</title>
         <section>
-        <title><varname>io.hfile.bloom.enabled</varname> global kill
-        switch</title>
+          <title><varname>io.hfile.bloom.enabled</varname> global kill switch</title>
 
-        <para><code>io.hfile.bloom.enabled</code> in
-        <classname>Configuration</classname> serves as the kill switch in case
-        something goes wrong. Default = <varname>true</varname>.</para>
+          <para><code>io.hfile.bloom.enabled</code> in <classname>Configuration</classname> serves
+            as the kill switch in case something goes wrong. Default =
+            <varname>true</varname>.</para>
         </section>
 
         <section>
-        <title><varname>io.hfile.bloom.error.rate</varname></title>
+          <title><varname>io.hfile.bloom.error.rate</varname></title>
 
-        <para><varname>io.hfile.bloom.error.rate</varname> = average false
-        positive rate. Default = 1%. Decrease rate by ½ (e.g. to .5%) == +1
-        bit per bloom entry.</para>
+          <para><varname>io.hfile.bloom.error.rate</varname> = average false positive rate. Default
+            = 1%. Decrease rate by ½ (e.g. to .5%) == +1 bit per bloom entry.</para>
         </section>
 
         <section>
-        <title><varname>io.hfile.bloom.max.fold</varname></title>
-
-        <para><varname>io.hfile.bloom.max.fold</varname> = guaranteed minimum
-        fold rate. Most people should leave this alone. Default = 7, or can
-        collapse to at least 1/128th of original size. See the
-        <emphasis>Development Process</emphasis> section of the document <link
-        xlink:href="https://issues.apache.org/jira/secure/attachment/12444007/Bloom_Filters_in_HBase.pdf">BloomFilters
-        in HBase</link> for more on what this option means.</para>
-        </section>
+          <title><varname>io.hfile.bloom.max.fold</varname></title>
+
+          <para><varname>io.hfile.bloom.max.fold</varname> = guaranteed minimum fold rate. Most
+            people should leave this alone. Default = 7, or can collapse to at least 1/128th of
+            original size. See the <emphasis>Development Process</emphasis> section of the document <link
+              xlink:href="https://issues.apache.org/jira/secure/attachment/12444007/Bloom_Filters_in_HBase.pdf">BloomFilters
+              in HBase</link> for more on what this option means.</para>
         </section>
-     </section>   <!--  bloom  -->
+      </section>
+    </section>
+    <!--  bloom  -->
 
-  </section>  <!--  reading -->
+  </section>
+  <!--  reading -->
 
-  <section xml:id="perf.deleting">
+  <section
+    xml:id="perf.deleting">
     <title>Deleting from HBase</title>
-     <section xml:id="perf.deleting.queue">
-       <title>Using HBase Tables as Queues</title>
-       <para>HBase tables are sometimes used as queues.  In this case, special care must be taken to regularly perform major compactions on tables used in
-       this manner.  As is documented in <xref linkend="datamodel" />, marking rows as deleted creates additional StoreFiles which then need to be processed
-       on reads.  Tombstones only get cleaned up with major compactions.
-       </para>
-       <para>See also <xref linkend="compaction" /> and <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#majorCompact%28java.lang.String%29">HBaseAdmin.majorCompact</link>.
-       </para>
-     </section>
-     <section xml:id="perf.deleting.rpc">
-       <title>Delete RPC Behavior</title>
-       <para>Be aware that <code>htable.delete(Delete)</code> doesn't use the writeBuffer.  It will execute an RegionServer RPC with each invocation.
-       For a large number of deletes, consider <code>htable.delete(List)</code>.
-       </para>
-       <para>See <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#delete%28org.apache.hadoop.hbase.client.Delete%29"></link>
-       </para>
-     </section>
-  </section>  <!--  deleting -->
-
-  <section xml:id="perf.hdfs"><title>HDFS</title>
-   <para>Because HBase runs on <xref linkend="arch.hdfs" /> it is important to understand how it works and how it affects
-   HBase.
-   </para>
-    <section xml:id="perf.hdfs.curr"><title>Current Issues With Low-Latency Reads</title>
-      <para>The original use-case for HDFS was batch processing.  As such, there low-latency reads were historically not a priority.
-      With the increased adoption of Apache HBase this is changing, and several improvements are already in development.
-      See the
-      <link xlink:href="https://issues.apache.org/jira/browse/HDFS-1599">Umbrella Jira Ticket for HDFS Improvements for HBase</link>.
+    <section
+      xml:id="perf.deleting.queue">
+      <title>Using HBase Tables as Queues</title>
+      <para>HBase tables are sometimes used as queues. In this case, special care must be taken to
+        regularly perform major compactions on tables used in this manner. As is documented in <xref
+          linkend="datamodel" />, marking rows as deleted creates additional StoreFiles which then
+        need to be processed on reads. Tombstones only get cleaned up with major compactions. </para>
+      <para>See also <xref
+          linkend="compaction" /> and <link
+          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#majorCompact%28java.lang.String%29">HBaseAdmin.majorCompact</link>.
       </para>
     </section>
-    <section xml:id="perf.hdfs.configs.localread">
-    <title>Leveraging local data</title>
-<para>Since Hadoop 1.0.0 (also 0.22.1, 0.23.1, CDH3u3 and HDP 1.0) via
-<link xlink:href="https://issues.apache.org/jira/browse/HDFS-2246">HDFS-2246</link>,
-it is possible for the DFSClient to take a "short circuit" and
-read directly from the disk instead of going through the DataNode when the
-data is local. What this means for HBase is that the RegionServers can
-read directly off their machine's disks instead of having to open a
-socket to talk to the DataNode, the former being generally much
-faster<footnote><para>See JD's <link xlink:href="http://files.meetup.com/1350427/hug_ebay_jdcryans.pdf">Performance Talk</link></para></footnote>.
-Also see <link xlink:href="http://search-hadoop.com/m/zV6dKrLCVh1">HBase, mail # dev - read short circuit</link> thread for
-more discussion around short circuit reads.
-</para>
-<para>To enable "short circuit" reads, it will depend on your version of Hadoop.
-    The original shortcircuit read patch was much improved upon in Hadoop 2 in
-    <link xlink:href="https://issues.apache.org/jira/browse/HDFS-347">HDFS-347</link>.
-    See <link xlink:href="http://blog.cloudera.com/blog/2013/08/how-improved-short-circuit-local-reads-bring-better-performance-and-security-to-hadoop/"></link> for details
-    on the difference between the old and new implementations.  See
-    <link xlink:href="http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html">Hadoop shortcircuit reads configuration page</link>
-    for how to enable the latter, better version of shortcircuit.
-For example, here is a minimal config. enabling short-circuit reads added to
-<filename>hbase-site.xml</filename>:
-<programlisting><![CDATA[<property>
+    <section
+      xml:id="perf.deleting.rpc">
+      <title>Delete RPC Behavior</title>
+      <para>Be aware that <code>htable.delete(Delete)</code> doesn't use the writeBuffer. It will
+        execute an RegionServer RPC with each invocation. For a large number of deletes, consider
+          <code>htable.delete(List)</code>. </para>
+      <para>See <link
+          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#delete%28org.apache.hadoop.hbase.client.Delete%29" />
+      </para>
+    </section>
+  </section>
+  <!--  deleting -->
+
+  <section
+    xml:id="perf.hdfs">
+    <title>HDFS</title>
+    <para>Because HBase runs on <xref
+        linkend="arch.hdfs" /> it is important to understand how it works and how it affects HBase. </para>
+    <section
+      xml:id="perf.hdfs.curr">
+      <title>Current Issues With Low-Latency Reads</title>
+      <para>The original use-case for HDFS was batch processing. As such, there low-latency reads
+        were historically not a priority. With the increased adoption of Apache HBase this is
+        changing, and several improvements are already in development. See the <link
+          xlink:href="https://issues.apache.org/jira/browse/HDFS-1599">Umbrella Jira Ticket for HDFS
+          Improvements for HBase</link>. </para>
+    </section>
+    <section
+      xml:id="perf.hdfs.configs.localread">
+      <title>Leveraging local data</title>
+      <para>Since Hadoop 1.0.0 (also 0.22.1, 0.23.1, CDH3u3 and HDP 1.0) via <link
+          xlink:href="https://issues.apache.org/jira/browse/HDFS-2246">HDFS-2246</link>, it is
+        possible for the DFSClient to take a "short circuit" and read directly from the disk instead
+        of going through the DataNode when the data is local. What this means for HBase is that the
+        RegionServers can read directly off their machine's disks instead of having to open a socket
+        to talk to the DataNode, the former being generally much faster<footnote>
+          <para>See JD's <link
+              xlink:href="http://files.meetup.com/1350427/hug_ebay_jdcryans.pdf">Performance
+              Talk</link></para>
+        </footnote>. Also see <link
+          xlink:href="http://search-hadoop.com/m/zV6dKrLCVh1">HBase, mail # dev - read short
+          circuit</link> thread for more discussion around short circuit reads. </para>
+      <para>To enable "short circuit" reads, it will depend on your version of Hadoop. The original
+        shortcircuit read patch was much improved upon in Hadoop 2 in <link
+          xlink:href="https://issues.apache.org/jira/browse/HDFS-347">HDFS-347</link>. See <link
+          xlink:href="http://blog.cloudera.com/blog/2013/08/how-improved-short-circuit-local-reads-bring-better-performance-and-security-to-hadoop/" />
+        for details on the difference between the old and new implementations. See <link
+          xlink:href="http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html">Hadoop
+          shortcircuit reads configuration page</link> for how to enable the latter, better version
+        of shortcircuit. For example, here is a minimal config. enabling short-circuit reads added
+        to <filename>hbase-site.xml</filename>: </para>
+      <programlisting><![CDATA[<property>
   <name>dfs.client.read.shortcircuit</name>
   <value>true</value>
   <description>
@@ -705,74 +812,85 @@ For example, here is a minimal config. enabling short-circuit reads added to
     TCP port of the DataNode.
   </description>
 </property>]]></programlisting>
-Be careful about permissions for the directory that hosts the shared domain
-socket; dfsclient will complain if open to other than the hbase user.
-<footnote><para>If you are running on an old Hadoop, one that is without
-    <link xlink:href="https://issues.apache.org/jira/browse/HDFS-347">HDFS-347</link> but that
-    has
-<link xlink:href="https://issues.apache.org/jira/browse/HDFS-2246">HDFS-2246</link>,
-you must set two configurations.
-First, the hdfs-site.xml needs to be amended. Set
-the property  <varname>dfs.block.local-path-access.user</varname>
-to be the <emphasis>only</emphasis> user that can use the shortcut.
-This has to be the user that started HBase.  Then in hbase-site.xml,
-set <varname>dfs.client.read.shortcircuit</varname> to be <varname>true</varname>
-</para></footnote>
-</para>
-<para>
-Services -- at least the HBase RegionServers -- will need to be restarted in order to pick up the new
-configurations.
-</para>
-<note xml:id="dfs.client.read.shortcircuit.buffer.size">
-    <title>dfs.client.read.shortcircuit.buffer.size</title>
-    <para>The default for this value is too high when running on a highly trafficed HBase.
-        In HBase, if this value has not been set, we set it down from the default of 1M to 128k
-        (Since HBase 0.98.0 and 0.96.1).  See <link xlink:href="https://issues.apache.org/jira/browse/HBASE-8143">HBASE-8143 HBase on Hadoop 2 with local short circuit reads (ssr) causes OOM</link>).
-        The Hadoop DFSClient in HBase will allocate a direct byte buffer of this size for <emphasis>each</emphasis>
-    block it has open; given HBase keeps its HDFS files open all the time, this can add up quickly.</para>
-</note>
-  </section>
-
-    <section xml:id="perf.hdfs.comp"><title>Performance Comparisons of HBase vs. HDFS</title>
-     <para>A fairly common question on the dist-list is why HBase isn't as performant as HDFS files in a batch context (e.g., as
-     a MapReduce source or sink).  The short answer is that HBase is doing a lot more than HDFS (e.g., reading the KeyValues,
-     returning the most current row or specified timestamps, etc.), and as such HBase is 4-5 times slower than HDFS in this
-     processing context.  There is room for improvement and this gap will, over time, be reduced, but HDFS
-      will always be faster in this use-case.
-     </para>
+      <para>Be careful about permissions for the directory that hosts the shared domain socket;
+        dfsclient will complain if open to other than the hbase user. <footnote>
+          <para>If you are running on an old Hadoop, one that is without <link
+              xlink:href="https://issues.apache.org/jira/browse/HDFS-347">HDFS-347</link> but that
+            has <link
+              xlink:href="https://issues.apache.org/jira/browse/HDFS-2246">HDFS-2246</link>, you
+            must set two configurations. First, the hdfs-site.xml needs to be amended. Set the
+            property <varname>dfs.block.local-path-access.user</varname> to be the
+              <emphasis>only</emphasis> user that can use the shortcut. This has to be the user that
+            started HBase. Then in hbase-site.xml, set
+              <varname>dfs.client.read.shortcircuit</varname> to be <varname>true</varname>
+          </para>
+        </footnote>
+      </para>
+      <para> Services -- at least the HBase RegionServers -- will need to be restarted in order to
+        pick up the new configurations. </para>
+      <note
+        xml:id="dfs.client.read.shortcircuit.buffer.size">
+        <title>dfs.client.read.shortcircuit.buffer.size</title>
+        <para>The default for this value is too high when running on a highly trafficed HBase. In
+          HBase, if this value has not been set, we set it down from the default of 1M to 128k
+          (Since HBase 0.98.0 and 0.96.1). See <link
+            xlink:href="https://issues.apache.org/jira/browse/HBASE-8143">HBASE-8143 HBase on Hadoop
+            2 with local short circuit reads (ssr) causes OOM</link>). The Hadoop DFSClient in HBase
+          will allocate a direct byte buffer of this size for <emphasis>each</emphasis> block it has
+          open; given HBase keeps its HDFS files open all the time, this can add up quickly.</para>
+      </note>
+    </section>
+
+    <section
+      xml:id="perf.hdfs.comp">
+      <title>Performance Comparisons of HBase vs. HDFS</title>
+      <para>A fairly common question on the dist-list is why HBase isn't as performant as HDFS files
+        in a batch context (e.g., as a MapReduce source or sink). The short answer is that HBase is
+        doing a lot more than HDFS (e.g., reading the KeyValues, returning the most current row or
+        specified timestamps, etc.), and as such HBase is 4-5 times slower than HDFS in this
+        processing context. There is room for improvement and this gap will, over time, be reduced,
+        but HDFS will always be faster in this use-case. </para>
     </section>
   </section>
 
-  <section xml:id="perf.ec2"><title>Amazon EC2</title>
-   <para>Performance questions are common on Amazon EC2 environments because it is a shared environment.  You will
-   not see the same throughput as a dedicated server.  In terms of running tests on EC2, run them several times for the same
-   reason (i.e., it's a shared environment and you don't know what else is happening on the server).
-   </para>
-   <para>If you are running on EC2 and post performance questions on the dist-list, please state this fact up-front that
-    because EC2 issues are practically a separate class of performance issues.
-   </para>
+  <section
+    xml:id="perf.ec2">
+    <title>Amazon EC2</title>
+    <para>Performance questions are common on Amazon EC2 environments because it is a shared
+      environment. You will not see the same throughput as a dedicated server. In terms of running
+      tests on EC2, run them several times for the same reason (i.e., it's a shared environment and
+      you don't know what else is happening on the server). </para>
+    <para>If you are running on EC2 and post performance questions on the dist-list, please state
+      this fact up-front that because EC2 issues are practically a separate class of performance
+      issues. </para>
   </section>
 
-  <section xml:id="perf.hbase.mr.cluster"><title>Collocating HBase and MapReduce</title>
-    <para>It is often recommended to have different clusters for HBase and MapReduce. A better qualification of this is:
-          don't collocate a HBase that serves live requests with a heavy MR workload. OLTP and OLAP-optimized systems have
-          conflicting requirements and one will lose to the other, usually the former. For example, short latency-sensitive
-          disk reads will have to wait in line behind longer reads that are trying to squeeze out as much throughput as
-          possible. MR jobs that write to HBase will also generate flushes and compactions, which will in turn invalidate
-          blocks in the <xref linkend="block.cache"/>.
-    </para>
-    <para>If you need to process the data from your live HBase cluster in MR, you can ship the deltas with <xref linkend="copy.table"/>
-          or use replication to get the new data in real time on the OLAP cluster. In the worst case, if you really need to
-          collocate both, set MR to use less Map and Reduce slots than you'd normally configure, possibly just one.
-    </para>
-    <para>When HBase is used for OLAP operations, it's preferable to set it up in a hardened way like configuring the ZooKeeper session
-          timeout higher and giving more memory to the MemStores (the argument being that the Block Cache won't be used much
-          since the workloads are usually long scans).
-    </para>
+  <section
+    xml:id="perf.hbase.mr.cluster">
+    <title>Collocating HBase and MapReduce</title>
+    <para>It is often recommended to have different clusters for HBase and MapReduce. A better
+      qualification of this is: don't collocate a HBase that serves live requests with a heavy MR
+      workload. OLTP and OLAP-optimized systems have conflicting requirements and one will lose to
+      the other, usually the former. For example, short latency-sensitive disk reads will have to
+      wait in line behind longer reads that are trying to squeeze out as much throughput as
+      possible. MR jobs that write to HBase will also generate flushes and compactions, which will
+      in turn invalidate blocks in the <xref
+        linkend="block.cache" />. </para>
+    <para>If you need to process the data from your live HBase cluster in MR, you can ship the
+      deltas with <xref
+        linkend="copy.table" /> or use replication to get the new data in real time on the OLAP
+      cluster. In the worst case, if you really need to collocate both, set MR to use less Map and
+      Reduce slots than you'd normally configure, possibly just one. </para>
+    <para>When HBase is used for OLAP operations, it's preferable to set it up in a hardened way
+      like configuring the ZooKeeper session timeout higher and giving more memory to the MemStores
+      (the argument being that the Block Cache won't be used much since the workloads are usually
+      long scans). </para>
   </section>
 
-  <section xml:id="perf.casestudy"><title>Case Studies</title>
-      <para>For Performance and Troubleshooting Case Studies, see <xref linkend="casestudies"/>.
-      </para>
+  <section
+    xml:id="perf.casestudy">
+    <title>Case Studies</title>
+    <para>For Performance and Troubleshooting Case Studies, see <xref
+        linkend="casestudies" />. </para>
   </section>
 </chapter>


Mime
View raw message