hbase-commits mailing list archives

From bus...@apache.org
Subject [48/51] [partial] hbase git commit: Published site at e73a9594c218ed969a2f5b0b356d7b8d0e1474c0.
Date Thu, 26 Nov 2015 04:30:13 GMT
http://git-wip-us.apache.org/repos/asf/hbase/blob/a986dfe6/book.html
----------------------------------------------------------------------
diff --git a/book.html b/book.html
index 59f045a..6a17bb0 100644
--- a/book.html
+++ b/book.html
@@ -250,7 +250,7 @@
 <li><a href="#_junit">147. JUnit</a></li>
 <li><a href="#_mockito">148. Mockito</a></li>
 <li><a href="#_mrunit">149. MRUnit</a></li>
-<li><a href="#_integration_testing_with_a_hbase_mini_cluster">150. Integration Testing with a HBase Mini-Cluster</a></li>
+<li><a href="#_integration_testing_with_an_hbase_mini_cluster">150. Integration Testing with an HBase Mini-Cluster</a></li>
 </ul>
 </li>
 <li><a href="#zookeeper">ZooKeeper</a>
@@ -1446,7 +1446,7 @@ See <a href="#loopback.ip">Loopback IP</a> for more details.</p>
 <p>Another related setting is the number of processes a user is allowed to run at once. In Linux and Unix, the number of processes is set using the <code>ulimit -u</code> command. This should not be confused with the <code>nproc</code> command, which controls the number of CPUs available to a given user. Under load, a <code>ulimit -u</code> that is too low can cause OutOfMemoryError exceptions. See Jack Levin&#8217;s major HDFS issues thread on the hbase-users mailing list, from 2011.</p>
 </div>
 <div class="paragraph">
-<p>Configuring the maximum number of file descriptors and processes for the user who is running the HBase process is an operating system configuration, rather than an HBase configuration. It is also important to be sure that the settings are changed for the user that actually runs HBase. To see which user started HBase, and that user&#8217;s ulimit configuration, look at the first line of the HBase log for that instance. A useful read setting config on you hadoop cluster is Aaron Kimballs' Configuration Parameters: What can you just ignore?</p>
+<p>Configuring the maximum number of file descriptors and processes for the user who is running the HBase process is an operating system configuration, rather than an HBase configuration. It is also important to be sure that the settings are changed for the user that actually runs HBase. To see which user started HBase, and that user&#8217;s ulimit configuration, look at the first line of the HBase log for that instance. A useful read on configuring your Hadoop cluster is Aaron Kimball&#8217;s Configuration Parameters: What can you just ignore?</p>
 </div>
 <div class="exampleblock">
 <div class="title">Example 6. <code>ulimit</code> Settings on Ubuntu</div>
@@ -1876,7 +1876,7 @@ Zookeeper binds to a well known port so clients may talk to HBase.</p>
 <div class="sect2">
 <h3 id="_distributed"><a class="anchor" href="#_distributed"></a>5.2. Distributed</h3>
 <div class="paragraph">
-<p>Distributed mode can be subdivided into distributed but all daemons run on a single node&#8201;&#8212;&#8201;a.k.a <em>pseudo-distributed</em>&#8201;&#8212;&#8201;and <em>fully-distributed</em> where the daemons are spread across all nodes in the cluster.
+<p>Distributed mode can be subdivided into distributed but all daemons run on a single node&#8201;&#8212;&#8201;a.k.a. <em>pseudo-distributed</em>&#8201;&#8212;&#8201;and <em>fully-distributed</em> where the daemons are spread across all nodes in the cluster.
 The <em>pseudo-distributed</em> vs. <em>fully-distributed</em> nomenclature comes from Hadoop.</p>
 </div>
 <div class="paragraph">
@@ -2049,7 +2049,7 @@ Check them out especially if HBase had trouble starting.</p>
 </div>
 <div class="paragraph">
 <p>HBase also puts up a UI listing vital attributes.
-By default it&#8217;s deployed on the Master host at port 16010 (HBase RegionServers listen on port 16020 by default and put up an informational HTTP server at port 16030). If the Master is running on a host named <code>master.example.org</code> on the default port, point your browser at <em><a href="http://master.example.org:16010" class="bare">http://master.example.org:16010</a></em> to see the web interface.</p>
+By default it&#8217;s deployed on the Master host at port 16010 (HBase RegionServers listen on port 16020 by default and put up an informational HTTP server at port 16030). If the Master is running on a host named <code>master.example.org</code> on the default port, point your browser at http://master.example.org:16010 to see the web interface.</p>
 </div>
 <div class="paragraph">
 <p>Prior to HBase 0.98 the master UI was deployed on port 60010, and the HBase RegionServers UI on port 60030.</p>
@@ -2521,7 +2521,7 @@ Configuration that it is thought rare anyone would change can exist only in code
 <dd>
 <div class="paragraph">
 <div class="title">Description</div>
-<p>Maximum size of all memstores in a region server before new updates are blocked and flushes are forced. Defaults to 40% of heap (0.4). Updates are blocked and flushes are forced until size of all memstores in a region server hits hbase.regionserver.global.memstore.size.lower.limit. The default value in this configuration has been intentionally left emtpy in order to honor the old hbase.regionserver.global.memstore.upperLimit property if present.</p>
+<p>Maximum size of all memstores in a region server before new updates are blocked and flushes are forced. Defaults to 40% of heap (0.4). Updates are blocked and flushes are forced until size of all memstores in a region server hits hbase.regionserver.global.memstore.size.lower.limit. The default value in this configuration has been intentionally left empty in order to honor the old hbase.regionserver.global.memstore.upperLimit property if present.</p>
 </div>
 <div class="paragraph">
 <div class="title">Default</div>
@@ -2536,7 +2536,7 @@ Configuration that it is thought rare anyone would change can exist only in code
 <dd>
 <div class="paragraph">
 <div class="title">Description</div>
-<p>Maximum size of all memstores in a region server before flushes are forced. Defaults to 95% of hbase.regionserver.global.memstore.size (0.95). A 100% value for this value causes the minimum possible flushing to occur when updates are blocked due to memstore limiting. The default value in this configuration has been intentionally left emtpy in order to honor the old hbase.regionserver.global.memstore.lowerLimit property if present.</p>
+<p>Maximum size of all memstores in a region server before flushes are forced. Defaults to 95% of hbase.regionserver.global.memstore.size (0.95). A 100% value for this value causes the minimum possible flushing to occur when updates are blocked due to memstore limiting. The default value in this configuration has been intentionally left empty in order to honor the old hbase.regionserver.global.memstore.lowerLimit property if present.</p>
 </div>
 <div class="paragraph">
 <div class="title">Default</div>
@@ -2641,7 +2641,7 @@ Configuration that it is thought rare anyone would change can exist only in code
 <dd>
 <div class="paragraph">
 <div class="title">Description</div>
-<p>ZooKeeper session timeout in milliseconds. It is used in two different ways. First, this value is used in the ZK client that HBase uses to connect to the ensemble. It is also used by HBase when it starts a ZK server and it is passed as the 'maxSessionTimeout'. See <a href="http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions" class="bare">http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions</a>. For example, if a HBase region server connects to a ZK ensemble that&#8217;s also managed by HBase, then the session timeout will be the one specified by this configuration. But, a region server that connects to an ensemble managed with a different configuration will be subjected that ensemble&#8217;s maxSessionTimeout. So, even though HBase might propose using 90 seconds, the ensemble can have a max timeout lower than this and it will take precedence. The current default that ZK ships with is 40 seconds, which is lower
  than HBase&#8217;s.</p>
+<p>ZooKeeper session timeout in milliseconds. It is used in two different ways. First, this value is used in the ZK client that HBase uses to connect to the ensemble. It is also used by HBase when it starts a ZK server and it is passed as the 'maxSessionTimeout'. See <a href="http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions" class="bare">http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions</a>. For example, if an HBase region server connects to a ZK ensemble that&#8217;s also managed by HBase, then the session timeout will be the one specified by this configuration. But, a region server that connects to an ensemble managed with a different configuration will be subjected to that ensemble&#8217;s maxSessionTimeout. So, even though HBase might propose using 90 seconds, the ensemble can have a max timeout lower than this and it will take precedence. The current default that ZK ships with is 40 seconds, which is lower
 than HBase&#8217;s.</p>
 </div>
 <div class="paragraph">
 <div class="title">Default</div>
@@ -2656,7 +2656,7 @@ Configuration that it is thought rare anyone would change can exist only in code
 <dd>
 <div class="paragraph">
 <div class="title">Description</div>
-<p>Root ZNode for HBase in ZooKeeper. All of HBase&#8217;s ZooKeeper files that are configured with a relative path will go under this node. By default, all of HBase&#8217;s ZooKeeper file path are configured with a relative path, so they will all go under this directory unless changed.</p>
+<p>Root ZNode for HBase in ZooKeeper. All of HBase&#8217;s ZooKeeper files that are configured with a relative path will go under this node. By default, all of HBase&#8217;s ZooKeeper file paths are configured with a relative path, so they will all go under this directory unless changed.</p>
 </div>
 <div class="paragraph">
 <div class="title">Default</div>
@@ -3087,7 +3087,7 @@ Configuration that it is thought rare anyone would change can exist only in code
 <dd>
 <div class="paragraph">
 <div class="title">Description</div>
-<p>How many time to retry attempting to write a version file before just aborting. Each attempt is seperated by the hbase.server.thread.wakefrequency milliseconds.</p>
+<p>How many times to retry attempting to write a version file before just aborting. Each attempt is separated by the hbase.server.thread.wakefrequency milliseconds.</p>
 </div>
 <div class="paragraph">
 <div class="title">Default</div>
@@ -3312,7 +3312,7 @@ Configuration that it is thought rare anyone would change can exist only in code
 <dd>
 <div class="paragraph">
 <div class="title">Description</div>
-<p>A StoreFile (or a selection of StoreFiles, when using ExploringCompactionPolicy) smaller than this size will always be eligible for minor compaction. HFiles this size or larger are evaluated by hbase.hstore.compaction.ratio to determine if they are eligible. Because this limit represents the "automatic include"limit for all StoreFiles smaller than this value, this value may need to be reduced in write-heavy environments where many StoreFiles in the 1-2 MB range are being flushed, because every StoreFile will be targeted for compaction and the resulting StoreFiles may still be under the minimum size and require further compaction. If this parameter is lowered, the ratio check is triggered more quickly. This addressed some issues seen in earlier versions of HBase but changing this parameter is no longer necessary in most situations. Default: 128 MB expressed in bytes.</p>
+<p>A StoreFile (or a selection of StoreFiles, when using ExploringCompactionPolicy) smaller than this size will always be eligible for minor compaction. HFiles this size or larger are evaluated by hbase.hstore.compaction.ratio to determine if they are eligible. Because this limit represents the "automatic include" limit for all StoreFiles smaller than this value, this value may need to be reduced in write-heavy environments where many StoreFiles in the 1-2 MB range are being flushed, because every StoreFile will be targeted for compaction and the resulting StoreFiles may still be under the minimum size and require further compaction. If this parameter is lowered, the ratio check is triggered more quickly. This addressed some issues seen in earlier versions of HBase but changing this parameter is no longer necessary in most situations. Default: 128 MB expressed in bytes.</p>
 </div>
 <div class="paragraph">
 <div class="title">Default</div>
@@ -3417,7 +3417,7 @@ Configuration that it is thought rare anyone would change can exist only in code
 <dd>
 <div class="paragraph">
 <div class="title">Description</div>
-<p>There are two different thread pools for compactions, one for large compactions and the other for small compactions. This helps to keep compaction of lean tables (such ashbase:meta) fast. If a compaction is larger than this threshold, it goes into the large compaction pool. In most cases, the default value is appropriate. Default: 2 x hbase.hstore.compaction.max x hbase.hregion.memstore.flush.size (which defaults to 128MB). The value field assumes that the value of hbase.hregion.memstore.flush.size is unchanged from the default.</p>
+<p>There are two different thread pools for compactions, one for large compactions and the other for small compactions. This helps to keep compaction of lean tables (such as hbase:meta) fast. If a compaction is larger than this threshold, it goes into the large compaction pool. In most cases, the default value is appropriate. Default: 2 x hbase.hstore.compaction.max x hbase.hregion.memstore.flush.size (which defaults to 128MB). The value field assumes that the value of hbase.hregion.memstore.flush.size is unchanged from the default.</p>
 </div>
 <div class="paragraph">
 <div class="title">Default</div>
@@ -4002,7 +4002,7 @@ Configuration that it is thought rare anyone would change can exist only in code
 <dd>
 <div class="paragraph">
 <div class="title">Description</div>
-<p>Set to true to skip the 'hbase.defaults.for.version' check. Setting this to true can be useful in contexts other than the other side of a maven generation; i.e. running in an ide. You&#8217;ll want to set this boolean to true to avoid seeing the RuntimException complaint: "hbase-default.xml file seems to be for and old version of HBase (\${hbase.version}), this version is X.X.X-SNAPSHOT"</p>
+<p>Set to true to skip the 'hbase.defaults.for.version' check. Setting this to true can be useful in contexts other than the other side of a maven generation; i.e. running in an IDE. You&#8217;ll want to set this boolean to true to avoid seeing the RuntimeException complaint: "hbase-default.xml file seems to be for and old version of HBase (\${hbase.version}), this version is X.X.X-SNAPSHOT"</p>
 </div>
 <div class="paragraph">
 <div class="title">Default</div>
@@ -4197,7 +4197,7 @@ Configuration that it is thought rare anyone would change can exist only in code
 <dd>
 <div class="paragraph">
 <div class="title">Description</div>
-<p>FS Permissions for the root directory in a secure(kerberos) setup. When master starts, it creates the rootdir with this permissions or sets the permissions if it does not match.</p>
+<p>FS Permissions for the root directory in a secure (kerberos) setup. When master starts, it creates the rootdir with these permissions or sets the permissions if they do not match.</p>
 </div>
 <div class="paragraph">
 <div class="title">Default</div>
@@ -4722,7 +4722,7 @@ Configuration that it is thought rare anyone would change can exist only in code
 <dd>
 <div class="paragraph">
 <div class="title">Description</div>
-<p>Whether asynchronous WAL replication to the secondary region replicas is enabled or not. If this is enabled, a replication peer named "region_replica_replication" will be created which will tail the logs and replicate the mutatations to region replicas for tables that have region replication &gt; 1. If this is enabled once, disabling this replication also requires disabling the replication peer using shell or ReplicationAdmin java class. Replication to secondary region replicas works over standard inter-cluster replication. So replication, if disabled explicitly, also has to be enabled by setting "hbase.replication" to true for this feature to work.</p>
+<p>Whether asynchronous WAL replication to the secondary region replicas is enabled or not. If this is enabled, a replication peer named "region_replica_replication" will be created which will tail the logs and replicate the mutations to region replicas for tables that have region replication &gt; 1. If this is enabled once, disabling this replication also requires disabling the replication peer using shell or ReplicationAdmin java class. Replication to secondary region replicas works over standard inter-cluster replication. So replication, if disabled explicitly, also has to be enabled by setting "hbase.replication" to true for this feature to work.</p>
 </div>
 <div class="paragraph">
 <div class="title">Default</div>
@@ -5097,7 +5097,7 @@ Thus clients require the location of the ZooKeeper ensemble before they can do a
 Usually this the ensemble location is kept out in the <em>hbase-site.xml</em> and is picked up by the client from the <code>CLASSPATH</code>.</p>
 </div>
 <div class="paragraph">
-<p>If you are configuring an IDE to run a HBase client, you should include the <em>conf/</em> directory on your classpath so <em>hbase-site.xml</em> settings can be found (or add <em>src/test/resources</em> to pick up the hbase-site.xml used by tests).</p>
+<p>If you are configuring an IDE to run an HBase client, you should include the <em>conf/</em> directory on your classpath so <em>hbase-site.xml</em> settings can be found (or add <em>src/test/resources</em> to pick up the hbase-site.xml used by tests).</p>
 </div>
 <div class="paragraph">
 <p>Minimally, a client of HBase needs several libraries in its <code>CLASSPATH</code> when connecting to a cluster, including:</p>
@@ -5475,7 +5475,7 @@ It is configured via <code>hbase.balancer.period</code> and defaults to 300000 (
 <h4 id="disabling.blockcache"><a class="anchor" href="#disabling.blockcache"></a>9.3.2. Disabling Blockcache</h4>
 <div class="paragraph">
 <p>Do not turn off block cache (You&#8217;d do it by setting <code>hbase.block.cache.size</code> to zero). Currently we do not do well if you do this because the RegionServer will spend all its time loading HFile indices over and over again.
-If your working set it such that block cache does you no good, at least size the block cache such that HFile indices will stay up in the cache (you can get a rough idea on the size you need by surveying RegionServer UIs; you&#8217;ll see index block size accounted near the top of the webpage).</p>
+If your working set is such that block cache does you no good, at least size the block cache such that HFile indices will stay up in the cache (you can get a rough idea on the size you need by surveying RegionServer UIs; you&#8217;ll see index block size accounted near the top of the webpage).</p>
 </div>
 </div>
 <div class="sect3">
@@ -5490,7 +5490,7 @@ You might also see the graphs on the tail of <a href="https://issues.apache.org/
 <h4 id="mttr"><a class="anchor" href="#mttr"></a>9.3.4. Better Mean Time to Recover (MTTR)</h4>
 <div class="paragraph">
 <p>This section is about configurations that will make servers come back faster after a fail.
-See the Deveraj Das an Nicolas Liochon blog post <a href="http://hortonworks.com/blog/introduction-to-hbase-mean-time-to-recover-mttr/">Introduction to HBase Mean Time to Recover (MTTR)</a> for a brief introduction.</p>
+See the Deveraj Das and Nicolas Liochon blog post <a href="http://hortonworks.com/blog/introduction-to-hbase-mean-time-to-recover-mttr/">Introduction to HBase Mean Time to Recover (MTTR)</a> for a brief introduction.</p>
 </div>
 <div class="paragraph">
 <p>The issue <a href="https://issues.apache.org/jira/browse/HBASE-8389">HBASE-8354 forces Namenode into loop with lease recovery requests</a> is messy but has a bunch of good discussion toward the end on low timeouts and how to effect faster recovery including citation of fixes added to HDFS. Read the Varun Sharma comments.
@@ -5679,7 +5679,7 @@ To enable the HBase JMX implementation on Master, you also need to add below pro
 <div class="listingblock">
 <div class="content">
 <pre class="CodeRay highlight"><code data-lang="xml"><span class="tag">&lt;property&gt;</span>
-  <span class="tag">&lt;ame&gt;</span>hbase.coprocessor.master.classes<span class="tag">&lt;/name&gt;</span>
+  <span class="tag">&lt;name&gt;</span>hbase.coprocessor.master.classes<span class="tag">&lt;/name&gt;</span>
   <span class="tag">&lt;value&gt;</span>org.apache.hadoop.hbase.JMXListener<span class="tag">&lt;/value&gt;</span>
 <span class="tag">&lt;/property&gt;</span></code></pre>
 </div>
@@ -5996,7 +5996,7 @@ It may be possible to skip across versions&#8201;&#8212;&#8201;for example go fr
 <dl>
 <dt class="hdlist1">HBase LimitedPrivate API</dt>
 <dd>
-<p>LimitedPrivate annotation comes with a set of target consumers for the interfaces. Those consumers are coprocessors, phoenix, replication endpoint implemnetations or similar. At this point, HBase only guarantees source and binary compatibility for these interfaces between patch versions.</p>
+<p>LimitedPrivate annotation comes with a set of target consumers for the interfaces. Those consumers are coprocessors, phoenix, replication endpoint implementations or similar. At this point, HBase only guarantees source and binary compatibility for these interfaces between patch versions.</p>
 </dd>
 </dl>
 </div>
@@ -6033,7 +6033,7 @@ It may be possible to skip across versions&#8201;&#8212;&#8201;for example go fr
 <p>A rolling upgrade is the process by which you update the servers in your cluster a server at a time. You can rolling upgrade across HBase versions if they are binary or wire compatible. See <a href="#hbase.rolling.restart">Rolling Upgrade Between Versions that are Binary/Wire Compatible</a> for more on what this means. Coarsely, a rolling upgrade is a graceful stop each server, update the software, and then restart. You do this for each server in the cluster. Usually you upgrade the Master first and then the RegionServers. See <a href="#rolling">Rolling Restart</a> for tools that can help use the rolling upgrade process.</p>
 </div>
 <div class="paragraph">
-<p>For example, in the below, HBase was symlinked to the actual HBase install. On upgrade, before running a rolling restart over the cluser, we changed the symlink to point at the new HBase software version and then ran</p>
+<p>For example, in the below, HBase was symlinked to the actual HBase install. On upgrade, before running a rolling restart over the cluster, we changed the symlink to point at the new HBase software version and then ran</p>
 </div>
 <div class="listingblock">
 <div class="content">
@@ -6083,7 +6083,7 @@ ports.</p>
 </div>
 <div id="upgrade1.0.hbase.bucketcache.percentage.in.combinedcache" class="paragraph">
 <div class="title">hbase.bucketcache.percentage.in.combinedcache configuration has been REMOVED</div>
-<p>You may have made use of this configuration if you are using BucketCache. If NOT using BucketCache, this change does not effect you. Its removal means that your L1 LruBlockCache is now sized using <code>hfile.block.cache.size</code>&#8201;&#8212;&#8201;i.e. the way you would size the on-heap L1 LruBlockCache if you were NOT doing BucketCache&#8201;&#8212;&#8201;and the BucketCache size is not whatever the setting for <code>hbase.bucketcache.size</code> is. You may need to adjust configs to get the LruBlockCache and BucketCache sizes set to what they were in 0.98.x and previous. If you did not set this config., its default value was 0.9. If you do nothing, your BucketCache will increase in size by 10%. Your L1 LruBlockCache will become <code>hfile.block.cache.size</code> times your java heap size (<code>hfile.block.cache.size</code> is a float between 0.0 and 1.0). To read more, see <a href="https://issues.apache.org/jira/browse/HBASE-11520">HBASE-11520 Simplify offheap cache conf
 ig by removing the confusing "hbase.bucketcache.percentage.in.combinedcache"</a>.</p>
+<p>You may have made use of this configuration if you are using BucketCache. If NOT using BucketCache, this change does not affect you. Its removal means that your L1 LruBlockCache is now sized using <code>hfile.block.cache.size</code>&#8201;&#8212;&#8201;i.e. the way you would size the on-heap L1 LruBlockCache if you were NOT doing BucketCache&#8201;&#8212;&#8201;and the BucketCache size is now whatever the setting for <code>hbase.bucketcache.size</code> is. You may need to adjust configs to get the LruBlockCache and BucketCache sizes set to what they were in 0.98.x and previous. If you did not set this config., its default value was 0.9. If you do nothing, your BucketCache will increase in size by 10%. Your L1 LruBlockCache will become <code>hfile.block.cache.size</code> times your java heap size (<code>hfile.block.cache.size</code> is a float between 0.0 and 1.0). To read more, see <a href="https://issues.apache.org/jira/browse/HBASE-11520">HBASE-11520 Simplify offheap cache
 config by removing the confusing "hbase.bucketcache.percentage.in.combinedcache"</a>.</p>
 </div>
 <div id="hbase-12068" class="paragraph">
 <div class="title">If you have your own customer filters.</div>
@@ -6371,7 +6371,7 @@ Successfully completed Log splitting</pre>
 <div class="sect2">
 <h3 id="upgrade0.94"><a class="anchor" href="#upgrade0.94"></a>12.6. Upgrading from 0.92.x to 0.94.x</h3>
 <div class="paragraph">
-<p>We used to think that 0.92 and 0.94 were interface compatible and that you can do a rolling upgrade between these versions but then we figured that <a href="https://issues.apache.org/jira/browse/HBASE-5357">HBASE-5357 Use builder pattern in HColumnDescriptor</a> changed method signatures so rather than return <code>void</code> they instead return <code>HColumnDescriptor</code>. This will throw`java.lang.NoSuchMethodError: org.apache.hadoop.hbase.HColumnDescriptor.setMaxVersions(I)V` so 0.92 and 0.94 are NOT compatible. You cannot do a rolling upgrade between them.</p>
+<p>We used to think that 0.92 and 0.94 were interface compatible and that you can do a rolling upgrade between these versions but then we figured that <a href="https://issues.apache.org/jira/browse/HBASE-5357">HBASE-5357 Use builder pattern in HColumnDescriptor</a> changed method signatures so rather than return <code>void</code> they instead return <code>HColumnDescriptor</code>. This will throw <code>java.lang.NoSuchMethodError: org.apache.hadoop.hbase.HColumnDescriptor.setMaxVersions(I)V</code> so 0.92 and 0.94 are NOT compatible. You cannot do a rolling upgrade between them.</p>
 </div>
 </div>
 <div class="sect2">
@@ -6567,7 +6567,7 @@ Spawning HBase Shell commands in this way is slow, so keep that in mind when you
 <div class="title">Example 8. Passing Commands to the HBase Shell</div>
 <div class="content">
 <div class="paragraph">
-<p>You can pass commands to the HBase Shell in non-interactive mode (see <a href="#hbasee.shell.noninteractive">hbasee.shell.noninteractive</a>) using the <code>echo</code> command and the <code>|</code> (pipe) operator.
+<p>You can pass commands to the HBase Shell in non-interactive mode (see <a href="#hbase.shell.noninteractive">hbase.shell.noninteractive</a>) using the <code>echo</code> command and the <code>|</code> (pipe) operator.
 Be sure to escape characters in the HBase commands which would otherwise be interpreted by the shell.
 Some debug-level output has been truncated from the example below.</p>
 </div>
@@ -7205,7 +7205,7 @@ This abstraction lays the groundwork for upcoming multi-tenancy related features
 <div class="ulist">
 <ul>
 <li>
-<p>Quota Management (<a href="https://issues.apache.org/jira/browse/HBASE-8410">HBASE-8410</a>) - Restrict the amount of resources (ie regions, tables) a namespace can consume.</p>
+<p>Quota Management (<a href="https://issues.apache.org/jira/browse/HBASE-8410">HBASE-8410</a>) - Restrict the amount of resources (i.e. regions, tables) a namespace can consume.</p>
 </li>
 <li>
 <p>Namespace Security Administration (<a href="https://issues.apache.org/jira/browse/HBASE-9206">HBASE-9206</a>) - Provide another level of security administration for tenants.</p>
@@ -7316,7 +7316,7 @@ For example, the columns <em>courses:history</em> and <em>courses:math</em> are
 The colon character (<code>:</code>) delimits the column family from the column family qualifier.
 The column family prefix must be composed of <em>printable</em> characters.
 The qualifying tail, the column family <em>qualifier</em>, can be made of any arbitrary bytes.
-Column families must be declared up front at schema definition time whereas columns do not need to be defined at schema time but can be conjured on the fly while the table is up an running.</p>
+Column families must be declared up front at schema definition time whereas columns do not need to be defined at schema time but can be conjured on the fly while the table is up and running.</p>
 </div>
 <div class="paragraph">
 <p>Physically, all column family members are stored together on the filesystem.
@@ -7350,7 +7350,7 @@ Gets are executed via <a href="http://hbase.apache.org/apidocs/org/apache/hadoop
 <div class="sect2">
 <h3 id="_put"><a class="anchor" href="#_put"></a>26.2. Put</h3>
 <div class="paragraph">
-<p><a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Put.html">Put</a> either adds new rows to a table (if the key is new) or can update existing rows (if the key already exists). Puts are executed via <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#put(org.apache.hadoop.hbase.client.Put)">Table.put</a> (writeBuffer) or link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#batch(java.util.List, java.lang.Object[])[Table.batch] (non-writeBuffer).</p>
+<p><a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Put.html">Put</a> either adds new rows to a table (if the key is new) or can update existing rows (if the key already exists). Puts are executed via <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#put(org.apache.hadoop.hbase.client.Put)">Table.put</a> (writeBuffer) or <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#batch(java.util.List,%20java.lang.Object%5B%5D)">Table.batch</a> (non-writeBuffer).</p>
 </div>
 </div>
 <div class="sect2">
@@ -7924,7 +7924,7 @@ The pile-up on a single region brought on by monotonically increasing keys can b
 </div>
 <div class="paragraph">
 <p>If you do need to upload time series data into HBase, you should study <a href="http://opentsdb.net/">OpenTSDB</a> as a successful example.
-It has a page describing the link: <a href="http://opentsdb.net/schema.html">schema</a> it uses in HBase.
+It has a page describing the <a href="http://opentsdb.net/schema.html">schema</a> it uses in HBase.
 The key format in OpenTSDB is effectively [metric_type][event_timestamp], which would appear at first glance to contradict the previous advice about not using a timestamp as the key.
 However, the difference is that the timestamp is not in the <em>lead</em> position of the key, and the design assumption is that there are dozens or hundreds (or more) of different metric types.
 Thus, even with a continual stream of input data with a mix of metric types, the Puts are distributed across various points of regions in the table.</p>
@@ -8104,8 +8104,8 @@ As an example of why this is important, consider the example of using displayabl
 <div class="paragraph">
 <p>The problem is that all the data is going to pile up in the first 2 regions and the last region thus creating a "lumpy" (and possibly "hot") region problem.
 To understand why, refer to an <a href="http://www.asciitable.com">ASCII Table</a>.
-'0' is byte 48, and 'f' is byte 102, but there is a huge gap in byte values (bytes 58 to 96) that will <em>never appear in this keyspace</em> because the only values are [0-9] and [a-f]. Thus, the middle regions regions will never be used.
-To make pre-spliting work with this example keyspace, a custom definition of splits (i.e., and not relying on the built-in split method) is required.</p>
+'0' is byte 48, and 'f' is byte 102, but there is a huge gap in byte values (bytes 58 to 96) that will <em>never appear in this keyspace</em> because the only values are [0-9] and [a-f]. Thus, the middle regions will never be used.
+To make pre-splitting work with this example keyspace, a custom definition of splits (i.e., and not relying on the built-in split method) is required.</p>
 </div>
 <div class="paragraph">
 <p>Lesson #1: Pre-splitting tables is generally a best practice, but you need to pre-split them in such a way that all the regions are accessible in the keyspace.
@@ -8184,7 +8184,7 @@ The minimum number of row versions parameter is used together with the time-to-l
 Input could be strings, numbers, complex objects, or even images as long as they can rendered as bytes.</p>
 </div>
 <div class="paragraph">
-<p>There are practical limits to the size of values (e.g., storing 10-50MB objects in HBase would probably be too much to ask); search the mailling list for conversations on this topic.
+<p>There are practical limits to the size of values (e.g., storing 10-50MB objects in HBase would probably be too much to ask); search the mailing list for conversations on this topic.
 All rows in HBase conform to the <a href="#datamodel">Data Model</a>, and that includes versioning.
 Take that into consideration when making your design, as well as block size for the ColumnFamily.</p>
 </div>
@@ -8326,7 +8326,7 @@ ROW                                              COLUMN+CELL
 <p>Notice how delete cells are let go.</p>
 </div>
 <div class="paragraph">
-<p>Now lets run the same test only with <code>KEEP_DELETED_CELLS</code> set on the table (you can do table or per-column-family):</p>
+<p>Now let&#8217;s run the same test only with <code>KEEP_DELETED_CELLS</code> set on the table (you can do table or per-column-family):</p>
 </div>
 <div class="listingblock">
 <div class="content">
@@ -8437,7 +8437,7 @@ However, don&#8217;t try a full-scan on a large table like this from an applicat
 <div class="sect2">
 <h3 id="secondary.indexes.periodic"><a class="anchor" href="#secondary.indexes.periodic"></a>40.2. Periodic-Update Secondary Index</h3>
 <div class="paragraph">
-<p>A secondary index could be created in an other table which is periodically updated via a MapReduce job.
+<p>A secondary index could be created in another table which is periodically updated via a MapReduce job.
 The job could be executed intra-day, but depending on load-strategy it could still potentially be out of sync with the main data table.</p>
 </div>
 <div class="paragraph">
@@ -8673,7 +8673,7 @@ by using an
 <div class="paragraph">
 <p>This effectively is the OpenTSDB approach.
 What OpenTSDB does is re-write data and pack rows into columns for certain time-periods.
-For a detailed explanation, see: link:http://opentsdb.net/schema.html, and
+For a detailed explanation, see: <a href="http://opentsdb.net/schema.html" class="bare">http://opentsdb.net/schema.html</a>, and
 <a href="http://www.cloudera.com/content/cloudera/en/resources/library/hbasecon/video-hbasecon-2012-lessons-learned-from-opentsdb.html">Lessons Learned from OpenTSDB</a>
 from HBaseCon2012.</p>
 </div>
@@ -8752,7 +8752,7 @@ There are two core record-types being ingested: a Customer record type, and Orde
 </div>
 </div>
 <div class="paragraph">
-<p>for a ORDER table.
+<p>for an ORDER table.
 However, there are more design decisions to make: are the <em>raw</em> values the best choices for rowkeys?</p>
 </div>
 <div class="paragraph">
@@ -9022,10 +9022,10 @@ For example, the ORDER table&#8217;s rowkey was described above: <a href="#schem
 <div class="paragraph">
 <p>There are many options here: JSON, XML, Java Serialization, Avro, Hadoop Writables, etc.
 All of them are variants of the same approach: encode the object graph to a byte-array.
-Care should be taken with this approach to ensure backward compatibilty in case the object model changes such that older persisted structures can still be read back out of HBase.</p>
+Care should be taken with this approach to ensure backward compatibility in case the object model changes such that older persisted structures can still be read back out of HBase.</p>
 </div>
 <div class="paragraph">
-<p>Pros are being able to manage complex object graphs with minimal I/O (e.g., a single HBase Get per Order in this example), but the cons include the aforementioned warning about backward compatiblity of serialization, language dependencies of serialization (e.g., Java Serialization only works with Java clients), the fact that you have to deserialize the entire object to get any piece of information inside the BLOB, and the difficulty in getting frameworks like Hive to work with custom objects like this.</p>
+<p>Pros are being able to manage complex object graphs with minimal I/O (e.g., a single HBase Get per Order in this example), but the cons include the aforementioned warning about backward compatibility of serialization, language dependencies of serialization (e.g., Java Serialization only works with Java clients), the fact that you have to deserialize the entire object to get any piece of information inside the BLOB, and the difficulty in getting frameworks like Hive to work with custom objects like this.</p>
 </div>
 </div>
 </div>
@@ -9040,7 +9040,7 @@ These are general guidelines and not laws - each application must consider its o
 <h4 id="schema.smackdown.rowsversions"><a class="anchor" href="#schema.smackdown.rowsversions"></a>42.4.1. Rows vs. Versions</h4>
 <div class="paragraph">
 <p>A common question is whether one should prefer rows or HBase&#8217;s built-in-versioning.
-The context is typically where there are "a lot" of versions of a row to be retained (e.g., where it is significantly above the HBase default of 1 max versions). The rows-approach would require storing a timestamp in some portion of the rowkey so that they would not overwite with each successive update.</p>
+The context is typically where there are "a lot" of versions of a row to be retained (e.g., where it is significantly above the HBase default of 1 max versions). The rows-approach would require storing a timestamp in some portion of the rowkey so that they would not overwrite with each successive update.</p>
 </div>
 <div class="paragraph">
 <p>Preference: Rows (generally speaking).</p>
@@ -9163,7 +9163,7 @@ I would also assume that we would have infrequent updates, but may have inserts
 <div class="paragraph">
 <p>Your two options mirror a common question people have when designing HBase schemas: should I go "tall" or "wide"? Your first schema is "tall": each row represents one value for one user, and so there are many rows in the table for each user; the row key is user + valueid, and there would be (presumably) a single column qualifier that means "the value". This is great if you want to scan over rows in sorted order by row key (thus my question above, about whether these ids are sorted correctly). You can start a scan at any user+valueid, read the next 30, and be done.
 What you&#8217;re giving up is the ability to have transactional guarantees around all the rows for one user, but it doesn&#8217;t sound like you need that.
-Doing it this way is generally recommended (see here link:http://hbase.apache.org/book.html#schema.smackdown).</p>
+Doing it this way is generally recommended (see here <a href="http://hbase.apache.org/book.html#schema.smackdown" class="bare">http://hbase.apache.org/book.html#schema.smackdown</a>).</p>
 </div>
 <div class="paragraph">
 <p>Your second option is "wide": you store a bunch of values in one row, using different qualifiers (where the qualifier is the valueid). The simple way to do that would be to just store ALL values for one user in a single row.
@@ -9172,7 +9172,7 @@ The client has methods that allow you to get specific slices of columns.</p>
 </div>
 <div class="paragraph">
 <p>Note that neither case fundamentally uses more disk space than the other; you&#8217;re just "shifting" part of the identifying information for a value either to the left (into the row key, in option one) or to the right (into the column qualifiers in option 2). Under the covers, every key/value still stores the whole row key, and column family name.
-(If this is a bit confusing, take an hour and watch Lars George&#8217;s excellent video about understanding HBase schema design: link:http://www.youtube.com/watch?v=_HLoH_PgrLk).</p>
+(If this is a bit confusing, take an hour and watch Lars George&#8217;s excellent video about understanding HBase schema design: <a href="http://www.youtube.com/watch?v=_HLoH_PgrLk" class="bare">http://www.youtube.com/watch?v=_HLoH_PgrLk</a>).</p>
 </div>
 <div class="paragraph">
 <p>A manually paginated version has lots more complexities, as you note, like having to keep track of how many things are in each page, re-shuffling if new values are inserted, etc.
@@ -9249,7 +9249,7 @@ The dependencies only need to be available on the local <code>CLASSPATH</code>.
 The following example runs the bundled HBase <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/RowCounter.html">RowCounter</a> MapReduce job against a table named <code>usertable</code>.
 If you have not set the environment variables expected in the command (the parts prefixed by a <code>$</code> sign and surrounded by curly braces), you can use the actual system paths instead.
 Be sure to use the correct version of the HBase JAR for your system.
-The backticks (<code>`</code> symbols) cause ths shell to execute the sub-commands, setting the output of <code>hbase classpath</code> (the command to dump HBase CLASSPATH) to <code>HADOOP_CLASSPATH</code>.
+The backticks (<code>`</code> symbols) cause the shell to execute the sub-commands, setting the output of <code>hbase classpath</code> (the command to dump HBase CLASSPATH) to <code>HADOOP_CLASSPATH</code>.
 This example assumes you use a BASH-compatible shell.</p>
 </div>
 <div class="listingblock">
@@ -9536,7 +9536,7 @@ That is where the logic for map-task assignment resides.</p>
 <div class="paragraph">
 <p>The following is an example of using HBase as a MapReduce source in read-only manner.
 Specifically, there is a Mapper instance but no Reducer, and nothing is being emitted from the Mapper.
-There job would be defined as follows&#8230;&#8203;</p>
+The job would be defined as follows&#8230;&#8203;</p>
 </div>
 <div class="listingblock">
 <div class="content">
@@ -9875,7 +9875,7 @@ Recognize that the more reducers that are assigned to the job, the more simultan
 <div class="sectionbody">
 <div class="paragraph">
 <p>It is generally advisable to turn off speculative execution for MapReduce jobs that use HBase as a source.
-This can either be done on a per-Job basis through properties, on on the entire cluster.
+This can either be done on a per-Job basis through properties, or on the entire cluster.
 Especially for longer running jobs, speculative execution will create duplicate map-tasks which will double-write your data to HBase; this is probably not what you want.</p>
 </div>
 <div class="paragraph">
@@ -9901,7 +9901,7 @@ way.</p>
 <span class="comment">// emits two fields: &quot;offset&quot; and &quot;line&quot;</span>
 Tap source = <span class="keyword">new</span> Hfs( <span class="keyword">new</span> TextLine(), inputFileLhs );
 
-<span class="comment">// store data in a HBase cluster</span>
+<span class="comment">// store data in an HBase cluster</span>
 <span class="comment">// accepts fields &quot;num&quot;, &quot;lower&quot;, and &quot;upper&quot;</span>
 <span class="comment">// will automatically scope incoming fields to their proper familyname, &quot;left&quot; or &quot;right&quot;</span>
 Fields keyFields = <span class="keyword">new</span> Fields( <span class="string"><span class="delimiter">&quot;</span><span class="content">num</span><span class="delimiter">&quot;</span></span> );
@@ -10354,7 +10354,7 @@ To enable REST gateway Kerberos authentication for client access, add the follow
 </div>
 <div class="paragraph">
 <p>HBase REST gateway supports different 'hbase.rest.authentication.type': simple, kerberos.
-You can also implement a custom authentication by implemening Hadoop AuthenticationHandler, then specify the full class name as 'hbase.rest.authentication.type' value.
+You can also implement a custom authentication by implementing Hadoop AuthenticationHandler, then specify the full class name as 'hbase.rest.authentication.type' value.
 For more information, refer to <a href="http://hadoop.apache.org/docs/stable/hadoop-auth/index.html">SPNEGO HTTP authentication</a>.</p>
 </div>
 </div>
@@ -10367,7 +10367,7 @@ To the HBase server, all requests are from the REST gateway user.
 The actual users are unknown.
 You can turn on the impersonation support.
 With impersonation, the REST gateway user is a proxy user.
-The HBase server knows the acutal/real user of each request.
+The HBase server knows the actual/real user of each request.
 So it can apply proper authorizations.</p>
 </div>
 <div class="paragraph">
@@ -11491,7 +11491,7 @@ Visibility labels are not currently applied for superusers.
 <tbody>
 <tr>
 <td class="tableblock halign-left valign-top"><div class="literal"><pre>fulltime</pre></div></td>
-<td class="tableblock halign-left valign-top"><p class="tableblock">Allow accesss to users associated with the fulltime label.</p></td>
+<td class="tableblock halign-left valign-top"><p class="tableblock">Allow access to users associated with the fulltime label.</p></td>
 </tr>
 <tr>
 <td class="tableblock halign-left valign-top"><div class="literal"><pre>!public</pre></div></td>
@@ -12345,7 +12345,8 @@ Technically speaking, HBase is really more a "Data Store" than "Data Base" becau
 <p>However, HBase has many features which supports both linear and modular scaling.
 HBase clusters expand by adding RegionServers that are hosted on commodity class servers.
 If a cluster expands from 10 to 20 RegionServers, for example, it doubles both in terms of storage and as well as processing capacity.
-RDBMS can scale well, but only up to a point - specifically, the size of a single database server - and for the best performance requires specialized hardware and storage devices.
+An RDBMS can scale well, but only up to a point - specifically, the size of a single database
+server - and for the best performance requires specialized hardware and storage devices.
 HBase features of note are:</p>
 </div>
 <div class="ulist">
@@ -12522,7 +12523,7 @@ If a region has both an empty start and an empty end key, it is the only region
 <div class="paragraph">
 <p>In the (hopefully unlikely) event that programmatic processing of catalog metadata
 is required, see the
-<a href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/Writables.html#getHRegionInfo%28byte[]%29">Writables</a>
+<a href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/Writables.html#getHRegionInfo%28byte%5B%5D%29">Writables</a>
 utility.</p>
 </div>
 </div>
@@ -12563,7 +12564,7 @@ Should a region be reassigned either by the master load balancer or because a Re
 <div class="sect3">
 <h4 id="_api_as_of_hbase_1_0_0"><a class="anchor" href="#_api_as_of_hbase_1_0_0"></a>63.1.1. API as of HBase 1.0.0</h4>
 <div class="paragraph">
-<p>Its been cleaned up and users are returned Interfaces to work against rather than particular types.
+<p>It&#8217;s been cleaned up and users are returned Interfaces to work against rather than particular types.
 In HBase 1.0, obtain a <code>Connection</code> object from <code>ConnectionFactory</code> and thereafter, get from it instances of <code>Table</code>, <code>Admin</code>, and <code>RegionLocator</code> on an as-need basis.
 When done, close the obtained instances.
 Finally, be sure to cleanup your <code>Connection</code> instance before exiting.
@@ -12715,7 +12716,11 @@ scan.setFilter(list);</code></pre>
 <div class="sect3">
 <h4 id="client.filter.cv.scvf"><a class="anchor" href="#client.filter.cv.scvf"></a>64.2.1. SingleColumnValueFilter</h4>
 <div class="paragraph">
-<p><a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.html">SingleColumnValueFilter</a> can be used to test column values for equivalence (<code>CompareOp.EQUAL</code>), inequality (<code>CompareOp.NOT_EQUAL</code>), or ranges (e.g., <code>CompareOp.GREATER</code>). The following is example of testing equivalence a column to a String value "my value"&#8230;&#8203;</p>
+<p>A SingleColumnValueFilter (see:
+<a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.html" class="bare">http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.html</a>)
+can be used to test column values for equivalence (<code>CompareOp.EQUAL</code>),
+inequality (<code>CompareOp.NOT_EQUAL</code>), or ranges (e.g., <code>CompareOp.GREATER</code>). The following is an
+example of testing equivalence of a column to a String value "my value"&#8230;&#8203;</p>
 </div>
 <div class="listingblock">
 <div class="content">
@@ -13228,7 +13233,8 @@ Here are others that you may have to take into account:</p>
 <dt class="hdlist1">Catalog Tables</dt>
 <dd>
 <p>The <code>-ROOT-</code> (prior to HBase 0.96, see <a href="#arch.catalog.root">arch.catalog.root</a>) and <code>hbase:meta</code> tables are forced into the block cache and have the in-memory priority which means that they are harder to evict.
-The former never uses more than a few hundreds bytes while the latter can occupy a few MBs (depending on the number of regions).</p>
+The former never uses more than a few hundred bytes while the latter can occupy a few MBs
+(depending on the number of regions).</p>
 </dd>
 <dt class="hdlist1">HFiles Indexes</dt>
 <dd>
@@ -13482,7 +13488,10 @@ For a RegionServer hosting data that can comfortably fit into cache, or if your
 <p>The RegionServer closes the parent region and marks the region as offline in its local data structures. <strong>THE SPLITTING REGION IS NOW OFFLINE.</strong> At this point, client requests coming to the parent region will throw <code>NotServingRegionException</code>. The client will retry with some backoff. The closing region is flushed.</p>
 </li>
 <li>
-<p>The  RegionServer creates region directories under the <code>.splits</code> directory, for daughter regions A and B, and creates necessary data structures. Then it splits the store files, in the sense that it creates two <a href="http://www.google.com/url?q=http%3A%2F%2Fhbase.apache.org%2Fapidocs%2Forg%2Fapache%2Fhadoop%2Fhbase%2Fio%2FReference.html&amp;sa=D&amp;sntz=1&amp;usg=AFQjCNEkCbADZ3CgKHTtGYI8bJVwp663CA">Reference</a> files per store file in the parent region. Those reference files will point to the parent regions&#8217;files.</p>
+<p>The RegionServer creates region directories under the <code>.splits</code> directory, for daughter
+regions A and B, and creates necessary data structures. Then it splits the store files,
+in the sense that it creates two Reference files per store file in the parent region.
+Those reference files will point to the parent region&#8217;s files.</p>
 </li>
 <li>
 <p>The RegionServer creates the actual region directory in HDFS, and moves the reference files for each daughter.</p>
@@ -13676,7 +13685,8 @@ After all edit files are replayed, the contents of the MemStore are written to d
 </div>
 <div class="paragraph">
 <p>If the <code>hbase.hlog.split.skip.errors</code> option is set to <code>false</code>, the default, the exception will be propagated and the split will be logged as failed.
-See <a href="https://issues.apache.org/jira/browse/HBASE-2958">HBASE-2958 When hbase.hlog.split.skip.errors is set to false, we fail the split but thats it</a>.
+See <a href="https://issues.apache.org/jira/browse/HBASE-2958">HBASE-2958 When
+hbase.hlog.split.skip.errors is set to false, we fail the split but that&#8217;s it</a>.
 We need to do more than just fail split if this flag is set.</p>
 </div>
 <div class="sect5">
@@ -13851,7 +13861,8 @@ ctime = Sat Jun 23 11:13:40 PDT 2012
 <p>Each RegionServer runs a daemon thread called the <em>split log worker</em>, which does the work to split the logs.
 The daemon thread starts when the RegionServer starts, and registers itself to watch HBase znodes.
 If any splitlog znode children change, it notifies a sleeping worker thread to wake up and grab more tasks.
-If if a worker&#8217;s current task&#8217;s node data is changed, the worker checks to see if the task has been taken by another worker.
+If a worker&#8217;s current task&#8217;s node data is changed,
+the worker checks to see if the task has been taken by another worker.
 If so, the worker thread stops work on the current task.</p>
 </div>
 <div class="paragraph">
@@ -13867,7 +13878,7 @@ At this point, the split log worker scans for another unclaimed task.</p>
 <p>It queries the task state and only takes action if the task is in `TASK_UNASSIGNED `state.</p>
 </li>
 <li>
-<p>If the task is is in <code>TASK_UNASSIGNED</code> state, the worker attempts to set the state to <code>TASK_OWNED</code> by itself.
+<p>If the task is in <code>TASK_UNASSIGNED</code> state, the worker attempts to set the state to <code>TASK_OWNED</code> by itself.
 If it fails to set the state, another worker will try to grab it.
 The split log manager will also ask all workers to rescan later if the task remains unassigned.</p>
 </li>
@@ -13886,7 +13897,7 @@ In the meantime, it starts a split task executor to do the actual work:</p>
 <p>If the worker catches an unexpected IOException, the task is set to state <code>TASK_ERR</code>.</p>
 </li>
 <li>
-<p>If the worker is shutting down, set the the task to state <code>TASK_RESIGNED</code>.</p>
+<p>If the worker is shutting down, set the task to state <code>TASK_RESIGNED</code>.</p>
 </li>
 <li>
 <p>If the task is taken by another worker, just log it.</p>
@@ -14231,7 +14242,7 @@ The master moves the region to <code>CLOSED</code> state and re-assigns it to a
 <li>
 <p>When a RegionServer is about to split a region, it notifies the master.
 The master moves the region to be split from <code>OPEN</code> to <code>SPLITTING</code> state and add the two new regions to be created to the RegionServer.
-These two regions are in <code>SPLITING_NEW</code> state initially.</p>
+These two regions are in <code>SPLITTING_NEW</code> state initially.</p>
 </li>
 <li>
 <p>After notifying the master, the RegionServer starts to split the region.
@@ -14346,8 +14357,8 @@ admin.createTable(tableDesc);
 </div>
 <div class="paragraph">
 <p>The default split policy can be overwritten using a custom
-link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/RegionSplitPolicy.html
-[RegionSplitPolicy(HBase 0.94+)]. Typically a custom split policy should extend HBase&#8217;s default split policy:
+<a href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/RegionSplitPolicy.html">RegionSplitPolicy(HBase 0.94+)</a>.
+Typically a custom split policy should extend HBase&#8217;s default split policy:
 <a href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/ConstantSizeRegionSplitPolicy.html">ConstantSizeRegionSplitPolicy</a>.</p>
 </div>
 <div class="paragraph">
@@ -15272,10 +15283,10 @@ Defaults to <code>hbase.hregion.memstore.flush.size</code> (128 mb).</p>
 <p>23 &#8594; Yes, because sum(12, 12) * 1.0 = 24.</p>
 </li>
 <li>
-<p>12 &#8594; Yes, because the previous file has been included, and because this does not exceed the the max-file limit of 5</p>
+<p>12 &#8594; Yes, because the previous file has been included, and because this does not exceed the max-file limit of 5</p>
 </li>
 <li>
-<p>12 &#8594; Yes, because the previous file had been included, and because this does not exceed the the max-file limit of 5.</p>
+<p>12 &#8594; Yes, because the previous file had been included, and because this does not exceed the max-file limit of 5.</p>
 </li>
 </ul>
 </div>
@@ -15688,7 +15699,7 @@ If the target table does not already exist in HBase, this tool will create the t
 <div class="sect2">
 <h3 id="arch.bulk.load.adv"><a class="anchor" href="#arch.bulk.load.adv"></a>68.5. Advanced Usage</h3>
 <div class="paragraph">
-<p>Although the <code>importtsv</code> tool is useful in many cases, advanced users may want to generate data programatically, or import data from other formats.
+<p>Although the <code>importtsv</code> tool is useful in many cases, advanced users may want to generate data programmatically, or import data from other formats.
 To get started doing so, dig into <code>ImportTsv.java</code> and check the JavaDoc for HFileOutputFormat.</p>
 </div>
 <div class="paragraph">
@@ -15815,8 +15826,8 @@ If required, this can be implemented later though.</p>
 <div class="title">Figure 3. Timeline Consistency</div>
 </div>
 <div class="paragraph">
-<p>To better understand the TIMELINE semantics, lets look at the above diagram.
-Lets say that there are two clients, and the first one writes x=1 at first, then x=2 and x=3 later.
+<p>To better understand the TIMELINE semantics, let&#8217;s look at the above diagram.
+Let&#8217;s say that there are two clients, and the first one writes x=1 at first, then x=2 and x=3 later.
 As above, all writes are handled by the primary region replica.
 The writes are saved in the write ahead log (WAL), and replicated to the other replicas asynchronously.
 In the above diagram, notice that replica_id=1 received 2 updates, and its data shows that x=2, while the replica_id=2 only received a single update, and its data shows that x=1.</p>
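+<div class="paragraph">
+<p>To make the semantics concrete, a TIMELINE read like the hedged sketch below may be answered by any replica, so the result should be checked for staleness. The table, row, family, and qualifier names are hypothetical; <code>Consistency.TIMELINE</code> and <code>Result.isStale()</code> are the client APIs introduced with region replicas.</p>
+</div>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="java">import org.apache.hadoop.hbase.client.Consistency;
+import org.apache.hadoop.hbase.client.Get;
+import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.util.Bytes;
+
+// Hedged sketch: 'table' is a Table obtained from a Connection elsewhere.
+Get get = new Get(Bytes.toBytes("row1"));
+get.setConsistency(Consistency.TIMELINE);     // allow any replica to answer
+Result result = table.get(get);
+byte[] value = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("x"));
+if (result.isStale()) {
+  // served by a secondary: 'value' may still be x=1 or x=2 while the
+  // primary already holds x=3
+}</code></pre>
+</div>
+</div>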
@@ -15880,7 +15891,7 @@ The regions opened in secondary mode will share the same data files with the pri
 <div class="sect2">
 <h3 id="_propagating_writes_to_region_replicas"><a class="anchor" href="#_propagating_writes_to_region_replicas"></a>70.5. Propagating writes to region replicas</h3>
 <div class="paragraph">
-<p>As discussed above writes only go to the primary region replica. For propagating the writes from the primary region replica to the secondaries, there are two different mechanisms. For read-only tables, you do not need to use any of the following methods. Disabling and enabling the table should make the data available in all region replicas. For mutable tables, you have to use <strong>only</strong> one of the following mechanisms: storefile refresher, or async wal replication. The latter is recommeded.</p>
+<p>As discussed above, writes only go to the primary region replica. For propagating the writes from the primary region replica to the secondaries, there are two different mechanisms. For read-only tables, you do not need to use any of the following methods. Disabling and enabling the table should make the data available in all region replicas. For mutable tables, you have to use <strong>only</strong> one of the following mechanisms: storefile refresher or async wal replication. The latter is recommended.</p>
 </div>
 <div class="sect3">
 <h4 id="_storefile_refresher"><a class="anchor" href="#_storefile_refresher"></a>70.5.1. StoreFile Refresher</h4>
@@ -15934,7 +15945,7 @@ Asyn WAL Replication feature will add a new replication peer named <code>region_
 <div class="sect2">
 <h3 id="_secondary_replica_failover"><a class="anchor" href="#_secondary_replica_failover"></a>70.9. Secondary replica failover</h3>
 <div class="paragraph">
-<p>When a secondary region replica first comes online, or fails over, it may have served some edits from it’s memstore. Since the recovery is handled differently for secondary replicas, the secondary has to ensure that it does not go back in time before it starts serving requests after assignment. For doing that, the secondary waits until it observes a full flush cycle (start flush, commit flush) or a “region open event” replicated from the primary. Until this happens, the secondary region replica will reject all read requests by throwing an IOException with message “The region&#8217;s reads are disabled”. However, the other replicas will probably still be available to read, thus not causing any impact for the rpc with TIMELINE consistency. To facilitate faster recovery, the secondary region will trigger a flush request from the primary when it is opened. The configuration property <code>hbase.region.replica.wait.for.primary.flush</code> (enabled by default) can be used to
  disable this feature if needed.</p>
+<p>When a secondary region replica first comes online, or fails over, it may have served some edits from its memstore. Since the recovery is handled differently for secondary replicas, the secondary has to ensure that it does not go back in time before it starts serving requests after assignment. For doing that, the secondary waits until it observes a full flush cycle (start flush, commit flush) or a “region open event” replicated from the primary. Until this happens, the secondary region replica will reject all read requests by throwing an IOException with message “The region&#8217;s reads are disabled”. However, the other replicas will probably still be available to read, thus not causing any impact for the rpc with TIMELINE consistency. To facilitate faster recovery, the secondary region will trigger a flush request from the primary when it is opened. The configuration property <code>hbase.region.replica.wait.for.primary.flush</code> (enabled by default) can be used to disable this feature if needed.</p>
 </div>
 </div>
 <div class="sect2">
@@ -15968,7 +15979,7 @@ Instead you can change the number of region replicas per table to increase or de
     <span class="tag">&lt;name&gt;</span>hbase.region.replica.replication.enabled<span class="tag">&lt;/name&gt;</span>
     <span class="tag">&lt;value&gt;</span>true<span class="tag">&lt;/value&gt;</span>
     <span class="tag">&lt;description&gt;</span>
-      Whether asynchronous WAL replication to the secondary region replicas is enabled or not. If this is enabled, a replication peer named &quot;region_replica_replication&quot; will be created which will tail the logs and replicate the mutatations to region replicas for tables that have region replication <span class="error">&gt;</span> 1. If this is enabled once, disabling this replication also      requires disabling the replication peer using shell or ReplicationAdmin java class. Replication to secondary region replicas works over standard inter-cluster replication. So replication, if disabled explicitly, also has to be enabled by setting &quot;hbase.replication&quot;· to true for this feature to work.
+      Whether asynchronous WAL replication to the secondary region replicas is enabled or not. If this is enabled, a replication peer named &quot;region_replica_replication&quot; will be created which will tail the logs and replicate the mutations to region replicas for tables that have region replication <span class="error">&gt;</span> 1. If this is enabled once, disabling this replication also requires disabling the replication peer using shell or ReplicationAdmin java class. Replication to secondary region replicas works over standard inter-cluster replication. So replication, if disabled explicitly, also has to be enabled by setting &quot;hbase.replication&quot; to true for this feature to work.
     <span class="tag">&lt;/description&gt;</span>
 <span class="tag">&lt;/property&gt;</span>
 <span class="tag">&lt;property&gt;</span>
@@ -16152,7 +16163,7 @@ hbase(main):001:0&gt; get 't1','r6', {CONSISTENCY =&gt; &quot;TIMELINE&quot;}</c
 <div class="sect3">
 <h4 id="_java_2"><a class="anchor" href="#_java_2"></a>70.13.2. Java</h4>
 <div class="paragraph">
-<p>You can set set the consistency for Gets and Scans and do requests as follows.</p>
+<p>You can set the consistency for Gets and Scans and do requests as follows.</p>
 </div>
 <div class="listingblock">
 <div class="content">
@@ -16322,7 +16333,7 @@ suit your environment, and restart or rolling restart the RegionServer.</p>
     <span class="tag">&lt;value&gt;</span>1000<span class="tag">&lt;/value&gt;</span>
     <span class="tag">&lt;description&gt;</span>
       Number of opened file handlers to cache.
-      A larger value will benefit reads by provinding more file handlers per mob
+      A larger value will benefit reads by providing more file handlers per mob
       file cache and would reduce frequent file opening and closing.
       However, if this is set too high, this could lead to a &quot;too many opened file handlers&quot; exception.
       The default value is 1000.
@@ -16381,7 +16392,7 @@ hbase&gt; major_compact_mob 't1'</pre>
 <h4 id="_mob_sweeper"><a class="anchor" href="#_mob_sweeper"></a>71.4.2. MOB Sweeper</h4>
 <div class="paragraph">
 <p>HBase MOB ships with a MapReduce job called the Sweeper tool for
-optimization. The Sweeper tool oalesces small MOB files or MOB files with many
+optimization. The Sweeper tool coalesces small MOB files or MOB files with many
 deletions or updates. The Sweeper tool is not required if you use native MOB compaction, which
 does not rely on MapReduce.</p>
 </div>
@@ -16639,7 +16650,7 @@ of the <a href="#security">Securing Apache HBase</a> chapter.</p>
 <div class="sect2">
 <h3 id="_using_rest_endpoints"><a class="anchor" href="#_using_rest_endpoints"></a>73.3. Using REST Endpoints</h3>
 <div class="paragraph">
-<p>The following examples use the placeholder server <code><a href="http://example.com:8000" class="bare">http://example.com:8000</a></code>, and
+<p>The following examples use the placeholder server http://example.com:8000, and
 the following commands can all be run using <code>curl</code> or <code>wget</code>. You can request
 plain text (the default), XML, or JSON output by adding no header for plain text,
 or the header "Accept: text/xml" for XML, or "Accept: application/json" for JSON.</p>
@@ -17910,11 +17921,11 @@ on 4 main interaction points between Spark and HBase. Those interaction points a
 <dl>
 <dt class="hdlist1">Basic Spark</dt>
 <dd>
-<p>The ability to have a HBase Connection at any point in your Spark DAG.</p>
+<p>The ability to have an HBase Connection at any point in your Spark DAG.</p>
 </dd>
 <dt class="hdlist1">Spark Streaming</dt>
 <dd>
-<p>The ability to have a HBase Connection at any point in your Spark Streaming
+<p>The ability to have an HBase Connection at any point in your Spark Streaming
 application.</p>
 </dd>
 <dt class="hdlist1">Spark Bulk Load</dt>
@@ -18154,7 +18165,7 @@ dStream.hbaseBulkPut(
 . The hbaseContext that carries the configuration broadcast information that links us
 to the HBase Connections in the executors
 . The table name of the table we are putting data into
-. A function that will convert a record in the DStream into a HBase Put object.</p>
+. A function that will convert a record in the DStream into an HBase Put object.</p>
 </div>
 </div>
 </div>
@@ -18344,7 +18355,7 @@ are <code>equal</code> operations.</p>
 </div>
 </div>
 <div class="paragraph">
-<p>Now lets look at an example where we will end up doing two scans on HBase.</p>
+<p>Now let&#8217;s look at an example where we will end up doing two scans on HBase.</p>
 </div>
 <div class="listingblock">
 <div class="content">
@@ -18582,7 +18593,7 @@ its design. Currently there are efforts going on to bridge this gap. For more in
 <dd>
 <p>Loading from configuration</p>
 </dd>
-<dt class="hdlist1">Dynammic</dt>
+<dt class="hdlist1">Dynamic</dt>
 <dd>
 <p>Loading via 'hbase shell' or via Java code (using the HTableDescriptor class).<br>
 For more details see <a href="#cp_loading">Loading Coprocessors</a>.</p>
@@ -18746,10 +18757,10 @@ or
 <p>From version 0.96, implementing an Endpoint Coprocessor is not straightforward. Now it is done with
 the help of Google&#8217;s Protocol Buffer. For more details on Protocol Buffer, please see
 <a href="https://developers.google.com/protocol-buffers/docs/proto">Protocol Buffer Guide</a>.
-Endpoints Coprocessor written in version 0.94 are not compatible with with version 0.96 or later
+Endpoint Coprocessors written in version 0.94 are not compatible with version 0.96 or later
 (for more details, see
 <a href="https://issues.apache.org/jira/browse/HBASE-5448">HBASE-5448</a>),
-so if your are upgrading your HBase cluster from version 0.94 (or before) to 0.96 (or later) you
+so if you are upgrading your HBase cluster from version 0.94 (or before) to 0.96 (or later) you
 have to rewrite your Endpoint coprocessor.</p>
 </div>
 <div class="paragraph">
@@ -18763,7 +18774,7 @@ have to rewrite your Endpoint coprocessor.</p>
 <div class="sectionbody">
 <div class="paragraph">
 <p>Loading of a Coprocessor refers to the process of making your custom Coprocessor implementation
-available to the the HBase, so that when a requests comes in or an event takes place the desired
+available to HBase, so that when a request comes in or an event takes place the desired
 functionality implemented in your custom code gets executed.<br>
 Coprocessors can be loaded broadly in two ways. One is static (loading through configuration files)
 and the other is dynamic loading (using hbase shell or Java code).</p>
@@ -18800,10 +18811,10 @@ sub elements &lt;name&gt; and &lt;value&gt; respectively.</p>
 </div>
 </li>
 <li>
-<p>&lt;value&gt; must contain the fully qualified class name of your class implmenting the Coprocessor.</p>
+<p>&lt;value&gt; must contain the fully qualified class name of your class implementing the Coprocessor.</p>
 <div class="paragraph">
 <p>For example, to load a Coprocessor (implemented in class SumEndPoint.java) you have to create the
-following entry in RegionServer&#8217;s 'hbase-site.xml' file (generally located under 'conf' directiory):</p>
+following entry in the RegionServer&#8217;s 'hbase-site.xml' file (generally located under the 'conf' directory):</p>
 </div>
 <div class="listingblock">
 <div class="content">
@@ -18835,7 +18846,7 @@ Ties are broken arbitrarily.</p>
 </div>
 </li>
 <li>
-<p>Put your code on classpth of HBase: There are various ways to do so, like adding jars on
+<p>Put your code on the classpath of HBase: There are various ways to do so, like adding jars on
 the classpath, etc. One easy way to do this is to drop the jar (containing your code and all the
 dependencies) in the 'lib' folder of the HBase installation.</p>
 </li>
@@ -19072,7 +19083,7 @@ hbase(main):<span class="octal">004</span>:<span class="integer">0</span>*   NAM
 </div>
 </li>
 <li>
-<p>Using HtableDescriptor: Simply reload the table definition <em>without</em> setting the value of
+<p>Using HTableDescriptor: Simply reload the table definition <em>without</em> setting the value of
 Coprocessor in either the setValue() or addCoprocessor() method. This will remove the Coprocessor
 attached to this table, if any. For example:</p>
 <div class="listingblock">
@@ -19315,12 +19326,12 @@ exported in a file called 'coprocessor.jar'.</p>
 <div class="listingblock">
 <div class="content">
 <pre class="CodeRay highlight"><code data-lang="java"><span class="predefined-type">Configuration</span> conf = HBaseConfiguration.create();
-<span class="comment">// Use below code for HBase verion 1.x.x or above.</span>
+<span class="comment">// Use below code for HBase version 1.x.x or above.</span>
 <span class="predefined-type">Connection</span> connection = ConnectionFactory.createConnection(conf);
 TableName tableName = TableName.valueOf(<span class="string"><span class="delimiter">&quot;</span><span class="content">users</span><span class="delimiter">&quot;</span></span>);
 Table table = connection.getTable(tableName);
 
-<span class="comment">//Use below code HBase verion 0.98.xx or below.</span>
+<span class="comment">//Use below code HBase version 0.98.xx or below.</span>
 <span class="comment">//HConnection connection = HConnectionManager.createConnection(conf);</span>
 <span class="comment">//HTableInterface table = connection.getTable(&quot;users&quot;);</span>
 
@@ -19510,12 +19521,12 @@ following code as shown below:</p>
 <div class="listingblock">
 <div class="content">
 <pre class="CodeRay highlight"><code data-lang="java"><span class="predefined-type">Configuration</span> conf = HBaseConfiguration.create();
-<span class="comment">// Use below code for HBase verion 1.x.x or above.</span>
+<span class="comment">// Use below code for HBase version 1.x.x or above.</span>
 <span class="predefined-type">Connection</span> connection = ConnectionFactory.createConnection(conf);
 TableName tableName = TableName.valueOf(<span class="string"><span class="delimiter">&quot;</span><span class="content">users</span><span class="delimiter">&quot;</span></span>);
 Table table = connection.getTable(tableName);
 
-<span class="comment">//Use below code HBase verion 0.98.xx or below.</span>
+<span class="comment">//Use below code HBase version 0.98.xx or below.</span>
 <span class="comment">//HConnection connection = HConnectionManager.createConnection(conf);</span>
 <span class="comment">//HTableInterface table = connection.getTable(&quot;users&quot;);</span>
 
@@ -19669,7 +19680,7 @@ single 48 port as opposed to 2x 24 port</p>
 </ul>
 </div>
 <div class="paragraph">
-<p>If the the switches in your rack have appropriate switching capacity to handle all the hosts at full speed, the next most likely issue will be caused by homing more of your cluster across racks.
+<p>If the switches in your rack have appropriate switching capacity to handle all the hosts at full speed, the next most likely issue will be caused by homing more of your cluster across racks.
 The easiest way to avoid issues when spanning multiple racks is to use port trunking to create a bonded uplink to other racks.
 The downside of this method, however, is in the overhead of ports that could potentially be used.
 An example of this is creating an 8Gbps port channel from rack A to rack B; using 8 of your 24 ports to communicate between racks gives you a poor ROI, while using too few can mean you&#8217;re not getting the most out of your cluster.</p>
@@ -19687,7 +19698,7 @@ An example of this is, creating an 8Gbps port channel from rack A to rack B, usi
 <div class="sect2">
 <h3 id="perf.network.call_me_maybe"><a class="anchor" href="#perf.network.call_me_maybe"></a>90.5. Network Consistency and Partition Tolerance</h3>
 <div class="paragraph">
-<p>The <a href="http://en.wikipedia.org/wiki/CAP_theorem">CAP Theorem</a> states that a distributed system can maintain two out of the following three charateristics:
+<p>The <a href="http://en.wikipedia.org/wiki/CAP_theorem">CAP Theorem</a> states that a distributed system can maintain two out of the following three characteristics:
 - *C*onsistency&#8201;&#8212;&#8201;all nodes see the same data.
 - *A*vailability&#8201;&#8212;&#8201;every request receives a response about whether it succeeded or failed.
 - *P*artition tolerance&#8201;&#8212;&#8201;the system continues to operate even if some of its components become unavailable to the others.</p>
@@ -20286,7 +20297,7 @@ When a Reducer step is used, all of the output (Puts) from the Mapper will get s
 It&#8217;s far more efficient to just write directly to HBase.</p>
 </div>
 <div class="paragraph">
-<p>For summary jobs where HBase is used as a source and a sink, then writes will be coming from the Reducer step (e.g., summarize values then write out result). This is a different processing problem than from the the above case.</p>
+<p>For summary jobs where HBase is used as a source and a sink, the writes will come from the Reducer step (e.g., summarize values and then write out the result). This is a different processing problem from the above case.</p>
 </div>
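+<div class="paragraph">
+<p>As a hedged sketch of that read-summarize-write shape (the table, family, and qualifier names are hypothetical, and this is illustrative rather than a recipe from this guide), a job can use <code>TableMapReduceUtil</code> to wire HBase in as both source and sink:</p>
+</div>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="java">import java.io.IOException;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
+import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
+import org.apache.hadoop.hbase.mapreduce.TableMapper;
+import org.apache.hadoop.hbase.mapreduce.TableReducer;
+import org.apache.hadoop.hbase.util.Bytes;
+import org.apache.hadoop.io.IntWritable;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapreduce.Job;
+
+public class SummaryJobSketch {
+
+  // Mapper reads rows from the source table and emits (value, 1) pairs.
+  static class MyMapper extends TableMapper&lt;Text, IntWritable&gt; {
+    private static final IntWritable ONE = new IntWritable(1);
+    @Override
+    protected void map(ImmutableBytesWritable row, Result value, Context context)
+        throws IOException, InterruptedException {
+      byte[] cell = value.getValue(Bytes.toBytes("cf"), Bytes.toBytes("attr"));
+      if (cell != null) {
+        context.write(new Text(Bytes.toString(cell)), ONE);
+      }
+    }
+  }
+
+  // Reducer sums the counts and writes a Put to the target table.
+  static class MyReducer extends TableReducer&lt;Text, IntWritable, ImmutableBytesWritable&gt; {
+    @Override
+    protected void reduce(Text key, Iterable&lt;IntWritable&gt; values, Context context)
+        throws IOException, InterruptedException {
+      int sum = 0;
+      for (IntWritable v : values) {
+        sum += v.get();
+      }
+      Put put = new Put(Bytes.toBytes(key.toString()));
+      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("count"), Bytes.toBytes(sum));
+      context.write(null, put);   // the row key is carried by the Put itself
+    }
+  }
+
+  public static void main(String[] args) throws Exception {
+    Configuration conf = HBaseConfiguration.create();
+    Job job = Job.getInstance(conf, "summary-sketch");
+    job.setJarByClass(SummaryJobSketch.class);
+    Scan scan = new Scan();
+    scan.setCaching(500);         // larger caching helps MapReduce scans
+    scan.setCacheBlocks(false);   // don't pollute the block cache from MR
+    TableMapReduceUtil.initTableMapperJob("source_table", scan, MyMapper.class,
+        Text.class, IntWritable.class, job);
+    TableMapReduceUtil.initTableReducerJob("summary_table", MyReducer.class, job);
+    System.exit(job.waitForCompletion(true) ? 0 : 1);
+  }
+}</code></pre>
+</div>
+</div>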
 </div>
 <div class="sect2">
@@ -20297,7 +20308,7 @@ It&#8217;s far more efficient to just write directly to HBase.</p>
 <div class="paragraph">
 <p>Also, if you are pre-splitting regions and all your data is <em>still</em> winding up in a single region even though your keys aren&#8217;t monotonically increasing, confirm that your keyspace actually works with the split strategy.
 There are a variety of reasons that regions may appear "well split" but won&#8217;t work with your data.
-As the HBase client communicates directly with the RegionServers, this can be obtained via <a href="hhttp://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#getRegionLocation(byte" class="bare">hhttp://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#getRegionLocation(byte</a>)[Table.getRegionLocation].</p>
+As the HBase client communicates directly with the RegionServers, this can be obtained via <a href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#getRegionLocation(byte%5B%5D)">Table.getRegionLocation</a>.</p>
 </div>
 <div class="paragraph">
 <p>See <a href="#precreate.regions">Table Creation: Pre-Creating Regions</a>, as well as <a href="#perf.configurations">HBase Configurations</a></p>
@@ -20349,7 +20360,7 @@ When rows have few columns and each column has only a few versions this can be i
 A seek operation is generally slower if it does not seek at least past 5-10 columns/versions or 512-1024 bytes.</p>
 </div>
 <div class="paragraph">
-<p>In order to opportunistically look ahead a few columns/versions to see if the next column/version can be found that way before a seek operation is scheduled, a new attribute <code>Scan.HINT_LOOKAHEAD</code> can be set the on Scan object.
+<p>In order to opportunistically look ahead a few columns/versions to see if the next column/version can be found that way before a seek operation is scheduled, a new attribute <code>Scan.HINT_LOOKAHEAD</code> can be set on the Scan object.
 The following code instructs the RegionServer to attempt two iterations of next before a seek is scheduled:</p>
 </div>
 <div class="listingblock">
@@ -20498,7 +20509,7 @@ Whichever read returns first is used, and the other read request is discarded.
 Hedged reads can be helpful for times where a rare slow read is caused by a transient error such as a failing disk or flaky network connection.</p>
 </div>
 <div class="paragraph">
-<p>Because a HBase RegionServer is a HDFS client, you can enable hedged reads in HBase, by adding the following properties to the RegionServer&#8217;s hbase-site.xml and tuning the values to suit your environment.</p>
+<p>Because an HBase RegionServer is an HDFS client, you can enable hedged reads in HBase by adding the following properties to the RegionServer&#8217;s hbase-site.xml and tuning the values to suit your environment.</p>
 </div>
 <div class="ulist">
 <div class="title">Configuration for Hedged Reads</div>
@@ -20687,7 +20698,7 @@ In terms of running tests on EC2, run them several times for the same reason (i.
 <div class="sectionbody">
 <div class="paragraph">
 <p>It is often recommended to have different clusters for HBase and MapReduce.
-A better qualification of this is: don&#8217;t collocate a HBase that serves live requests with a heavy MR workload.
+A better qualification of this is: don&#8217;t collocate an HBase cluster that serves live requests with a heavy MR workload.
 OLTP and OLAP-optimized systems have conflicting requirements and one will lose to the other, usually the former.
 For example, short latency-sensitive disk reads will have to wait in line behind longer reads that are trying to squeeze out as much throughput as possible.
 MR jobs that write to HBase will also generate flushes and compactions, which will in turn invalidate blocks in the <a href="#block.cache">Block Cache</a>.</p>
@@ -20789,11 +20800,11 @@ The HBase Master is typically run on the NameNode server, and well as ZooKeeper.
 <div class="sect3">
 <h4 id="rpc.logging"><a class="anchor" href="#rpc.logging"></a>104.2.1. Enabling RPC-level logging</h4>
 <div class="paragraph">
-<p>Enabling the RPC-level logging on a RegionServer can often given insight on timings at the server.
+<p>Enabling the RPC-level logging on a RegionServer can often give insight on timings at the server.
 Once enabled, the amount of log spewed is voluminous.
 It is not recommended that you leave this logging on for more than short bursts of time.
 To enable RPC-level logging, browse to the RegionServer UI and click on <em>Log Level</em>.
-Set the log level to <code>DEBUG</code> for the package <code>org.apache.hadoop.ipc</code> (Thats right, for <code>hadoop.ipc</code>, NOT, <code>hbase.ipc</code>). Then tail the RegionServers log.
+Set the log level to <code>DEBUG</code> for the package <code>org.apache.hadoop.ipc</code> (That&#8217;s right, for <code>hadoop.ipc</code>, NOT <code>hbase.ipc</code>). Then tail the RegionServer&#8217;s log.
 Analyze.</p>
 </div>
 <div class="paragraph">
@@ -20895,7 +20906,7 @@ CMS pauses are always low, but if your ParNew starts growing, you can see minor
 </div>
 <div class="paragraph">
 <p>This can be due to the size of the ParNew, which should be relatively small.
-If your ParNew is very large after running HBase for a while, in one example a ParNew was about 150MB, then you might have to constrain the size of ParNew (The larger it is, the longer the collections take but if its too small, objects are promoted to old gen too quickly). In the below we constrain new gen size to 64m.</p>
+If your ParNew is very large after running HBase for a while (in one example, a ParNew was about 150MB), then you might have to constrain the size of ParNew (the larger it is, the longer the collections take, but if it&#8217;s too small, objects are promoted to old gen too quickly). In the example below we constrain new gen size to 64m.</p>
 </div>
 <div class="paragraph">
 <p>Add the below line in <em>hbase-env.sh</em>:</p>
@@ -21203,7 +21214,7 @@ java.lang.Thread.State: WAITING (on object monitor)
 </div>
 </div>
 <div class="paragraph">
-<p>A handler thread that&#8217;s waiting for stuff to do (like put, delete, scan, etc):</p>
+<p>A handler thread that&#8217;s waiting for stuff to do (like put, delete, scan, etc.):</p>
 </div>
 <div class="listingblock">
 <div class="content">
@@ -21678,7 +21689,7 @@ are snapshots and WALs.</p>
 <dt class="hdlist1">Snapshots</dt>
 <dd>
 <p>When you create a snapshot, HBase retains everything it needs to recreate the table&#8217;s
-state at that time of tne snapshot. This includes deleted cells or expired versions.
+state at the time of the snapshot. This includes deleted cells or expired versions.
 For this reason, your snapshot usage pattern should be well-planned, and you should
 prune snapshots that you no longer need. Snapshots are stored in <code>/hbase/.snapshots</code>,
 and archives needed to restore snapshots are stored in
@@ -21946,7 +21957,7 @@ This exception is returned back to the client and then the client goes back to <
 <div class="paragraph">
 <p>Fix your DNS.
 In versions of Apache HBase before 0.92.x, reverse DNS needs to give the same answer as a forward lookup.
-See <a href="https://issues.apache.org/jira/browse/HBASE-3431">HBASE 3431 RegionServer is not using the name given it by the master; double entry in master listing of servers</a> for gorey details.</p>
+See <a href="https://issues.apache.org/jira/browse/HBASE-3431">HBASE 3431 RegionServer is not using the name given it by the master; double entry in master listing of servers</a> for gory details.</p>
 </div>
 </div>
 <div class="sect3">
@@ -22548,7 +22559,7 @@ These jobs were consistently found to be waiting on map and reduce tasks assigne
 <p>Two 12-core processors</p>
 </li>
 <li>
-<p>Six Enerprise SATA disks</p>
+<p>Six Enterprise SATA disks</p>
 </li>
 <li>
 <p>24GB of RAM</p>
@@ -22926,7 +22937,7 @@ This run sets the timeout value to 60 seconds, the default value is 600 seconds.
 <div class="paragraph">
 <p>By default, the canary tool only checks read operations, so it&#8217;s hard to find a problem in the
 write path. To enable write sniffing, you can run the canary with the <code>-writeSniffing</code> option.
-When the write sniffing is enabled, the canary tool will create a hbase table and make sure the
+When write sniffing is enabled, the canary tool will create an HBase table and make sure the
 regions of the table are distributed across all region servers. In each sniffing period, the canary will
 try to put data into these regions to check the write availability of each region server.</p>
 </div>
@@ -23138,7 +23149,7 @@ You can invoke it via the HBase cli with the 'wal' command.</p>
 <div class="title">WAL Printing in older versions of HBase</div>
 <div class="paragraph">
 <p>Prior to version 2.0, the WAL Pretty Printer was called the <code>HLogPrettyPrinter</code>, after an internal name for HBase&#8217;s write ahead log.
-In those versions, you can pring the contents of a WAL using the same configuration as above, but with the 'hlog' command.</p>
+In those versions, you can print the contents of a WAL using the same configuration as above, but with the 'hlog' command.</p>
 </div>
 <div class="listingblock">
 <div class="content">
@@ -23358,7 +23369,7 @@ row10	c1	c2</pre>
 </div>
 </div>
 <div class="paragraph">
-<p>For ImportTsv to use this imput file, the command line needs to look like this:</p>
+<p>For ImportTsv to use this input file, the command line needs to look like this:</p>
 </div>
 <div class="listingblock">
 <div class="content">
@@ -23714,7 +23725,7 @@ Usage: graceful_stop.sh [--config &amp;conf-dir&gt;] [--restart] [--reload] [--t
 <div class="paragraph">
 <p>The <code>HOSTNAME</code> passed to <em>graceful_stop.sh</em> must match the hostname that HBase is using to identify RegionServers.
 Check the list of RegionServers in the master UI for how HBase is referring to servers.
-Its usually hostname but can also be FQDN.
+It&#8217;s usually hostname but can also be FQDN.
 Whatever HBase is using, this is what you should pass to the <em>graceful_stop.sh</em> decommission script.
 If you pass IPs, the script is not yet smart enough to make a hostname (or FQDN) of it and so it will fail when it checks if the server is currently running; the graceful unloading of regions will not run.</p>
 </div>
@@ -23769,13 +23780,13 @@ Hence, it is better to manage the balancer apart from <code>graceful_stop</code>
 <div class="sect3">
 <h4 id="draining.servers"><a class="anchor" href="#draining.servers"></a>128.1.1. Decommissioning several Regions Servers concurrently</h4>
 <div class="paragraph">
-<p>If you have a large cluster, you may want to decommission more than one machine at a time by gracefully stopping mutiple RegionServers concurrently.
+<p>If you have a large cluster, you may want to decommission more than one machine at a time by gracefully stopping multiple RegionServers concurrently.
 To gracefully drain multiple regionservers at the same time, RegionServers can be put into a "draining" state.
 This is done by marking a RegionServer as a draining node by creating an entry in ZooKeeper under the <em>hbase_root/draining</em> znode.
 This znode has format <code>name,port,startcode</code> just like the regionserver entries under <em>hbase_root/rs</em> znode.</p>
 </div>
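+<div class="paragraph">
+<p>For illustration only, the entry described above could be created with the plain ZooKeeper client as in the hedged sketch below. The quorum address, server name, port, and startcode are placeholders, the <code>/hbase/draining</code> path assumes the default znode settings, and in practice operators usually script this (for example via <code>hbase zkcli</code>) rather than hand-coding it.</p>
+</div>
+<div class="listingblock">
+<div class="content">
+<pre class="CodeRay highlight"><code data-lang="java">import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.ZooDefs;
+import org.apache.zookeeper.ZooKeeper;
+
+public class MarkDrainingSketch {
+  public static void main(String[] args) throws Exception {
+    // Placeholder quorum; the empty lambda ignores connection events.
+    ZooKeeper zk = new ZooKeeper("zk1.example.com:2181", 30000, event -&gt; { });
+    // The znode name follows the name,port,startcode format of the entries
+    // under hbase_root/rs, as described above; values here are placeholders.
+    zk.create("/hbase/draining/rs1.example.com,16020,1448000000000",
+        new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
+    zk.close();
+  }
+}</code></pre>
+</div>
+</div>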
 <div class="paragraph">
-<p>Without this facility, decommissioning mulitple nodes may be non-optimal because regions that are being drained from one region server may be moved to other regionservers that are also draining.
+<p>Without this facility, decommissioning multiple nodes may be non-optimal because regions that are being drained from one region server may be moved to other regionservers that are also draining.
 Marking RegionServers to be in the draining state prevents this from happening.
 See this <a href="http://inchoate-clatter.blogspot.com/2012/03/hbase-ops-automation.html">blog
             post</a> for more details.</p>
@@ -23992,7 +24003,7 @@ Restart the region server for the changes to take effect.</p>
 </div>
 <div class="paragraph">
 <p>To change the sampling rate for the default sink, edit the line beginning with <code>*.period</code>.
-To filter which metrics are emitted or to extend the metrics framework, see link:http://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics2/package-summary.html</p>
+To filter which metrics are emitted or to extend the metrics framework, see <a href="http://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics2/package-summary.html" class="bare">http://hadoop.apache.org/docs/current/api/org/apache/hadoop/metrics2/package-summary.html</a></p>
 </div>
 <div class="admonitionblock note">
 <table>
@@ -24030,19 +24041,19 @@ Different metrics are exposed for the Master process and each region server proc
 <div class="title">Procedure: Access a JSON Output of Available Metrics</div>
 <ol class="arabic">
 <li>
-<p>After starting HBase, access the region server&#8217;s web UI, at <code><a href="http://REGIONSERVER_HOSTNAME:60030" class="bare">http://REGIONSERVER_HOSTNAME:60030</a></code> by default (or port 16030 in HBase 1.0+).</p>
+<p>After starting HBase, access the region server&#8217;s web UI at http://REGIONSERVER_HOSTNAME:60030 by default (or port 16030 in HBase 1.0+).</p>
 </li>
 <li>
 <p>Click the <span class="label">Metrics

<TRUNCATED>
