hbase-commits mailing list archives

From bus...@apache.org
Subject [15/15] hbase git commit: HBASE-14066 clean out old docbook docs from branch-1.
Date Tue, 14 Jul 2015 02:50:31 GMT
HBASE-14066 clean out old docbook docs from branch-1.


Project: http://git-wip-us.apache.org/repos/asf/hbase/repo
Commit: http://git-wip-us.apache.org/repos/asf/hbase/commit/09a422e8
Tree: http://git-wip-us.apache.org/repos/asf/hbase/tree/09a422e8
Diff: http://git-wip-us.apache.org/repos/asf/hbase/diff/09a422e8

Branch: refs/heads/branch-1.2
Commit: 09a422e80ece5364a8f7df74e652a80e69d91354
Parents: 4b6f014
Author: Sean Busbey <busbey@apache.org>
Authored: Mon Jul 13 14:44:00 2015 -0500
Committer: Sean Busbey <busbey@apache.org>
Committed: Mon Jul 13 21:51:00 2015 -0500

----------------------------------------------------------------------
 src/main/docbkx/appendix_acl_matrix.xml         |  662 --
 .../appendix_contributing_to_documentation.xml  |  426 --
 src/main/docbkx/appendix_hfile_format.xml       |  657 --
 src/main/docbkx/book.xml                        | 6069 ------------------
 src/main/docbkx/case_studies.xml                |  239 -
 src/main/docbkx/community.xml                   |  149 -
 src/main/docbkx/configuration.xml               | 1653 -----
 src/main/docbkx/cp.xml                          |  431 --
 src/main/docbkx/customization.xsl               |   49 -
 src/main/docbkx/developer.xml                   | 2343 -------
 src/main/docbkx/external_apis.xml               |   79 -
 src/main/docbkx/getting_started.xml             |  728 ---
 src/main/docbkx/hbase_apis.xml                  |  133 -
 src/main/docbkx/ops_mgt.xml                     | 2488 -------
 src/main/docbkx/performance.xml                 | 1207 ----
 src/main/docbkx/preface.xml                     |   83 -
 src/main/docbkx/rpc.xml                         |  301 -
 src/main/docbkx/schema_design.xml               | 1247 ----
 src/main/docbkx/security.xml                    | 1895 ------
 src/main/docbkx/shell.xml                       |  386 --
 src/main/docbkx/thrift_filter_language.xml      |  757 ---
 src/main/docbkx/tracing.xml                     |  187 -
 src/main/docbkx/troubleshooting.xml             | 1700 -----
 src/main/docbkx/unit_testing.xml                |  330 -
 src/main/docbkx/upgrading.xml                   |  833 ---
 src/main/docbkx/zookeeper.xml                   |  513 --
 26 files changed, 25545 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hbase/blob/09a422e8/src/main/docbkx/appendix_acl_matrix.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/appendix_acl_matrix.xml b/src/main/docbkx/appendix_acl_matrix.xml
deleted file mode 100644
index a0d4695..0000000
--- a/src/main/docbkx/appendix_acl_matrix.xml
+++ /dev/null
@@ -1,662 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<appendix version="5.0" xml:id="appendix_acl_matrix"
-    xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink"
-    xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:svg="http://www.w3.org/2000/svg"
-    xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:html="http://www.w3.org/1999/xhtml"
-    xmlns:db="http://docbook.org/ns/docbook">
-    <!--
-/**
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
--->
-
-    <title>Access Control Matrix</title>
-      <para>The following matrix shows the minimum permission set required to perform operations in
-        HBase. Before using the table, read through the information about how to interpret it.</para>
-      <variablelist>
-        <title>Interpreting the ACL Matrix Table</title>
-        <para>The following conventions are used in the ACL Matrix table:</para>
-        <varlistentry>
-          <term>Scopes</term>
-          <listitem>
-            <para>Permissions are evaluated starting at the widest scope and working to the
-              narrowest scope. A scope corresponds to a level of the data model. From broadest to
-              narrowest, the scopes are as follows:</para>
-            <itemizedlist>
-              <listitem><para>Global</para></listitem>
-              <listitem><para>Namespace (NS)</para></listitem>
-              <listitem><para>Table</para></listitem>
-              <listitem><para>Column Family (CF)</para></listitem>
-              <listitem><para>Column Qualifier (CQ)</para></listitem>
-              <listitem><para>Cell</para></listitem>
-            </itemizedlist>
-                <para>For instance, a permission granted at table level dominates any grants done at
-                    the Column Family, Column Qualifier, or cell level. The user can do what that
-                    grant implies at any location in the table. A permission granted at global scope
-                    dominates all: the user is always allowed to take that action everywhere.</para>
-          </listitem>
-        </varlistentry>
-        <varlistentry>
-          <term>Permissions</term>
-          <listitem>
-            <para>Possible permissions include the following:</para>
-            <itemizedlist>
-              <listitem><para>Superuser - a special user that belongs to group "supergroup" and has
-                unlimited access</para></listitem>
-              <listitem><para>Admin (A)</para></listitem>
-              <listitem><para>Create (C)</para></listitem>
-              <listitem><para>Write (W)</para></listitem>
-              <listitem><para>Read (R)</para></listitem>
-              <listitem><para>Execute (X)</para></listitem>
-            </itemizedlist>
-          </listitem>
-        </varlistentry>
-      </variablelist>
-
-      <para>For the most part, permissions work in an expected way, with the following caveats:</para>
-      <itemizedlist>
-        <listitem>
-          <para>Having Write permission does not imply Read permission. It is possible and sometimes
-          desirable for a user to be able to write data that same user cannot read. One such example
-          is a log-writing process.</para>
-        </listitem>
-        <listitem>
-          <para>The <systemitem>hbase:meta</systemitem> table is readable by every user, regardless
-            of the user's other grants or restrictions. This is a requirement for HBase to
-            function correctly.</para>
-        </listitem>
-        <listitem>
-            <para><code>CheckAndPut</code> and <code>CheckAndDelete</code> operations will fail if
-                the user does not have both Write and Read permission.</para>
-        </listitem>
-        <listitem>
-            <para><code>Increment</code> and <code>Append</code> operations do not require Read
-                access.</para>
-        </listitem>
-      </itemizedlist>
-
-    <para>The following table is sorted by the interface that provides each operation. In case the
-        table goes out of date, the unit tests which check for accuracy of permissions can be found
-        in
-            <filename>hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java</filename>,
-        and the access controls themselves can be examined in
-            <filename>hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java</filename>.</para>
-
-    <table frame="all">
-        <title>ACL Matrix</title>
-        <tgroup cols="4">
-            <thead>
-                <row>
-                    <entry>Interface</entry>
-                    <entry>Operation</entry>
-                    <entry>Minimum Scope</entry>
-                    <entry>Minimum Permission</entry>
-                </row>
-            </thead>
-            <tbody>
-                <row>
-                    <entry morerows="27">
-                        <!-- increment this if you add another "master" operation -->
-                        <para>Master</para>
-                    </entry>
-                    <entry>
-                        <para>createTable</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>C</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>modifyTable</para>
-                    </entry>
-                    <entry>
-                        <para>Table</para>
-                    </entry>
-                    <entry>
-                        <para>A|C</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>deleteTable</para>
-                    </entry>
-                    <entry>
-                        <para>Table</para>
-                    </entry>
-                    <entry>
-                        <para>A|C</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>truncateTable</para>
-                    </entry>
-                    <entry>
-                        <para>Table</para>
-                    </entry>
-                    <entry>
-                        <para>A|C</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>addColumn</para>
-                    </entry>
-                    <entry>
-                        <para>Table</para>
-                    </entry>
-                    <entry>
-                        <para>A|C</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>modifyColumn</para>
-                    </entry>
-                    <entry>
-                        <para>Table</para>
-                    </entry>
-                    <entry>
-                        <para>A|C</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>deleteColumn</para>
-                    </entry>
-                    <entry>
-                        <para>Table</para>
-                    </entry>
-                    <entry>
-                        <para>A|C</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>disableTable</para>
-                    </entry>
-                    <entry>
-                        <para>Table</para>
-                    </entry>
-                    <entry>
-                        <para>A|C</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>disableAclTable</para>
-                    </entry>
-                    <entry>
-                        <para>None</para>
-                    </entry>
-                    <entry>
-                        <para>Not allowed</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>enableTable</para>
-                    </entry>
-                    <entry>
-                        <para>Table</para>
-                    </entry>
-                    <entry>
-                        <para>A|C</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>move</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>assign</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>unassign</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>regionOffline</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>balance</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>balanceSwitch</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>shutdown</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>stopMaster</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>snapshot</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>clone</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>restore</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>deleteSnapshot</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>createNamespace</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>deleteNamespace</para>
-                    </entry>
-                    <entry>
-                        <para>Namespace</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>modifyNamespace</para>
-                    </entry>
-                    <entry>
-                        <para>Namespace</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>flushTable</para>
-                    </entry>
-                    <entry>
-                        <para>Table</para>
-                    </entry>
-                    <entry>
-                        <para>A|C</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>getTableDescriptors</para>
-                    </entry>
-                    <entry>
-                        <para>Global|Table</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>mergeRegions</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry morerows="25">Region</entry>
-                    <!-- Increment this if you add any more Region
-                operations -->
-                    <entry>open</entry>
-                    <entry>Global</entry>
-                    <entry>A</entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>openRegion</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>close</entry>
-                    <entry>Global</entry>
-                    <entry>A</entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>closeRegion</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>stopRegionServer</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                  <entry>
-                    <para>rollHLog</para>
-                  </entry>
-                  <entry>
-                    <para>Global</para>
-                  </entry>
-                  <entry>
-                    <para>A</para>
-                  </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>mergeRegions</para>
-                    </entry>
-                    <entry>
-                        <para>Global</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>append</entry>
-                    <entry>Table|CF|CQ</entry>
-                    <entry>W</entry>
-                </row>
-                <row>
-                    <entry>delete</entry>
-                    <entry>Table|CF|CQ|Cell (if the user has write permission for all cells)</entry>
-                    <entry>W</entry>
-                </row>
-                <row>
-                    <entry>exists</entry>
-                    <entry>Table|CF|CQ</entry>
-                    <entry>R</entry>
-                </row>
-                <row>
-                    <entry>get</entry>
-                    <entry>Table|CF|CQ</entry>
-                    <entry>R</entry>
-                </row>
-                <row>
-                    <entry>getClosestRowBefore</entry>
-                    <entry>Table|CF|CQ</entry>
-                    <entry>R</entry>
-                </row>
-                <row>
-                    <entry>increment</entry>
-                    <entry>Table|CF|CQ</entry>
-                    <entry>W</entry>
-                </row>
-                <row>
-                    <entry>put</entry>
-                    <entry>Table|CF|CQ</entry>
-                    <entry>W</entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>flush</para>
-                    </entry>
-                    <entry>
-                        <para>Global|Table</para>
-                    </entry>
-                    <entry>
-                        <para>A|C</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>split</para>
-                    </entry>
-                    <entry>
-                        <para>Global|Table</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>compact</para>
-                    </entry>
-                    <entry>
-                        <para>Global|Table</para>
-                    </entry>
-                    <entry>
-                        <para>A|C</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>bulkLoadHFile</entry>
-                    <entry>Table</entry>
-                    <entry>W</entry>
-                </row>
-                <row>
-                    <entry>prepareBulkLoad</entry>
-                    <entry>Table</entry>
-                    <entry>C</entry>
-                </row>
-                <row>
-                    <entry>cleanupBulkLoad</entry>
-                    <entry>Table</entry>
-                    <entry>W</entry>
-                </row>
-                <row>
-                    <entry>checkAndDelete</entry>
-                    <entry>Table|CF|CQ</entry>
-                    <entry>RW</entry>
-                </row>
-                <row>
-                    <entry>checkAndPut</entry>
-                    <entry>Table|CF|CQ</entry>
-                    <entry>RW</entry>
-                </row>
-                <row>
-                    <entry>incrementColumnValue</entry>
-                    <entry>Table|CF|CQ</entry>
-                    <entry>RW</entry>
-                </row>
-                <row>
-                    <entry>scannerClose</entry>
-                    <entry>Table</entry>
-                    <entry>R</entry>
-                </row>
-                <row>
-                    <entry>scannerNext</entry>
-                    <entry>Table</entry>
-                    <entry>R</entry>
-                </row>
-                <row>
-                    <entry>scannerOpen</entry>
-                    <entry>Table|CF|CQ</entry>
-                    <entry>R</entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>Endpoint</para>
-                    </entry>
-                    <entry>
-                        <para>invoke</para>
-                    </entry>
-                    <entry>Endpoint</entry>
-                    <entry>
-                        <para>X</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry morerows="3">
-                        <para>AccessController</para>
-                    </entry>
-                    <entry>
-                        <para>grant</para>
-                    </entry>
-                    <entry>Global|Table|NS</entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>revoke</para>
-                    </entry>
-                    <entry>Global|Table|NS</entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>getUserPermissions</para>
-                    </entry>
-                    <entry>
-                        <para>Global|Table|NS</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-                <row>
-                    <entry>
-                        <para>checkPermissions</para>
-                    </entry>
-                    <entry>
-                        <para>Global|Table|NS</para>
-                    </entry>
-                    <entry>
-                        <para>A</para>
-                    </entry>
-                </row>
-            </tbody>
-        </tgroup>
-    </table>
-</appendix>    
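
Editor's note on the deleted appendix above: its scope rule (a grant at a broader scope dominates narrower scopes) and its caveats (Write does not imply Read; CheckAndPut and CheckAndDelete need both) can be illustrated with a small model. This is only a sketch and is not the HBase AccessController API. The names Scope, is_allowed, and check_and_mutate_allowed are hypothetical, and the model assumes every grant in the set applies to an ancestor of the resource being accessed.

```python
from enum import IntEnum


class Scope(IntEnum):
    """Scopes from broadest (lowest value) to narrowest, as listed in the appendix."""
    GLOBAL = 0
    NAMESPACE = 1
    TABLE = 2
    COLUMN_FAMILY = 3
    COLUMN_QUALIFIER = 4
    CELL = 5


def is_allowed(grants, scope, action):
    """Return True if some grant for `action` sits at `scope` or at a broader scope.

    `grants` is a set of (Scope, action) pairs; actions are the single letters
    used in the matrix ("A", "C", "W", "R", "X"). Hypothetical helper.
    """
    return any(g_action == action and g_scope <= scope
               for g_scope, g_action in grants)


def check_and_mutate_allowed(grants, scope):
    """CheckAndPut / CheckAndDelete need both Read and Write (hypothetical helper)."""
    return is_allowed(grants, scope, "R") and is_allowed(grants, scope, "W")


# A log-writing process: may write anywhere in the table but cannot read it back.
log_writer = {(Scope.TABLE, "W")}
print(is_allowed(log_writer, Scope.CELL, "W"))
print(is_allowed(log_writer, Scope.CELL, "R"))
print(check_and_mutate_allowed(log_writer, Scope.COLUMN_QUALIFIER))
```

In this model a table-level Write grant covers every column family, qualifier, and cell beneath it, while a check-and-mutate request by the same user is refused because no Read grant exists at any enclosing scope.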

http://git-wip-us.apache.org/repos/asf/hbase/blob/09a422e8/src/main/docbkx/appendix_contributing_to_documentation.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/appendix_contributing_to_documentation.xml b/src/main/docbkx/appendix_contributing_to_documentation.xml
deleted file mode 100644
index 2f19c7b..0000000
--- a/src/main/docbkx/appendix_contributing_to_documentation.xml
+++ /dev/null
@@ -1,426 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<appendix version="5.0" xml:id="appendix_contributing_to_documentation"
-    xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink"
-    xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:svg="http://www.w3.org/2000/svg"
-    xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:html="http://www.w3.org/1999/xhtml"
-    xmlns:db="http://docbook.org/ns/docbook">
-    <!--
-/**
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
--->
-
-    <title>Contributing to Documentation</title>
-    <para>The Apache HBase project welcomes contributions to all aspects of the project, including
-        the documentation. In HBase, documentation includes the following areas, and probably some
-        others:</para>
-    <itemizedlist>
-        <listitem>
-            <para>The <link xlink:href="http://hbase.apache.org/book.html">HBase Reference
-                    Guide</link> (this book)</para>
-        </listitem>
-        <listitem>
-            <para>The <link xlink:href="http://hbase.apache.org/">HBase website</link></para>
-        </listitem>
-        <listitem>
-            <para>The <link xlink:href="http://wiki.apache.org/hadoop/Hbase">HBase
-                Wiki</link></para>
-        </listitem>
-        <listitem>
-            <para>API documentation</para>
-        </listitem>
-        <listitem>
-            <para>Command-line utility output and help text</para>
-        </listitem>
-        <listitem>
-            <para>Web UI strings, explicit help text, context-sensitive strings, and others</para>
-        </listitem>
-        <listitem>
-            <para>Log messages</para>
-        </listitem>
-        <listitem>
-            <para>Comments in source files, configuration files, and others</para>
-        </listitem>
-        <listitem>
-            <para>Localization of any of the above into target languages other than English</para>
-        </listitem>
-    </itemizedlist>
-    <para>No matter which area you want to help out with, the first step is almost always to
-        download (typically by cloning the Git repository) and familiarize yourself with the HBase
-        source code. The only exception in the list above is the HBase Wiki, which is edited online.
-        For information on downloading and building the source, see <xref linkend="developer"
-        />.</para>
-
-    <section>
-        <title>Getting Access to the Wiki</title>
-        <para>The HBase Wiki is not well-maintained and much of its content has been moved into the
-            HBase Reference Guide (this guide). However, some pages on the Wiki are well maintained,
-            and it would be great to have some volunteers willing to help out with the Wiki. To
-            request access to the Wiki, register a new account at <link
-                xlink:href="https://wiki.apache.org/hadoop/Hbase?action=newaccount"
-                >https://wiki.apache.org/hadoop/Hbase?action=newaccount</link>. Contact one of the
-            HBase committers, who can either give you access or refer you to someone who can.</para>
-    </section>
-    <section>
-        <title>Contributing to Documentation or Other Strings</title>
-        <para> If you spot an error in a string in a UI, utility, script, log message, or elsewhere,
-            or you think something could be made more clear, or you think text needs to be added
-            where it doesn't currently exist, the first step is to file a JIRA. Be sure to set the
-            component to <literal>Documentation</literal> in addition to any other involved components.
-            Most components have one or more default owners, who monitor new issues which come into
-            those queues. Regardless of whether you feel able to fix the bug, you should still file
-            bugs where you see them.</para>
-        <para>If you want to try your hand at fixing your newly-filed bug, assign it to yourself.
-            You will need to clone the HBase Git repository to your local system and work on the
-            issue there. When you have developed a potential fix, submit it for review. If it
-            addresses the issue and is seen as an improvement, one of the HBase committers will
-            commit it to one or more branches, as appropriate.</para>
-        <procedure xml:id="submit_doc_patch_procedure">
-            <title>Suggested Workflow for Submitting Patches</title>
-            <para>This procedure goes into more detail than Git pros will need, but is included in
-                this appendix so that people unfamiliar with Git can feel confident contributing to
-                HBase while they learn.</para>
-            <step>
-                <para>If you have not already done so, clone the Git repository locally. You only
-                    need to do this once.</para>
-            </step>
-            <step>
-                <para>Fairly often, pull remote changes into your local repository by using the
-                        <code>git pull</code> command, while your master branch is checked
-                    out.</para>
-            </step>
-            <step>
-                <para>For each issue you work on, create a new branch. One convention that works
-                    well for naming the branches is to name a given branch the same as the JIRA it
-                    relates to:</para>
-                <screen language="bourne">$ git checkout -b HBASE-123456</screen>
-            </step>
-            <step>
-                <para>Make your suggested changes on your branch, committing your changes to your
-                    local repository often. If you need to switch to working on a different issue,
-                    remember to check out the appropriate branch.</para>
-            </step>
-            <step>
-                <para>When you are ready to submit your patch, first be sure that HBase builds
-                    cleanly and behaves as expected in your modified branch. If you have made
-                    documentation changes, be sure the documentation and website build.</para>
-                <note>
-                    <para>Before you use the <literal>site</literal> target the very first time, be
-                        sure you have built HBase at least once, in order to fetch all the Maven
-                        dependencies you need.</para>
-                </note>
-                <screen language="bourne">$ mvn clean install -DskipTests               # Builds HBase</screen>
-                <screen language="bourne">$ mvn clean site -DskipTests                  # Builds the website and documentation</screen>
-                <para>If any errors occur, address them.</para>
-            </step>
-            <step>
-                <para>If it takes you several days or weeks to implement your fix, or you know that
-                    the area of the code you are working in has had a lot of changes lately, make
-                    sure you rebase your branch against the remote master and take care of any
-                    conflicts before submitting your patch.</para>
-                <screen language="bourne">
-$ git checkout HBASE-123456
-$ git rebase origin/master
-                </screen>
-            </step>
-            <step>
-                <para>Generate your patch against the remote master. Run the following command from
-                    the top level of your git repository (usually called
-                    <literal>hbase</literal>):</para>
-                <screen language="bourne">$ git diff --no-prefix origin/master > HBASE-123456.patch</screen>
-                <para>The name of the patch should contain the JIRA ID. Look over the patch file to
-                    be sure that you did not change any additional files by accident and that there
-                    are no other surprises. When you are satisfied, attach the patch to the JIRA and
-                    click the <guibutton>Patch Available</guibutton> button. A reviewer
-                    will review your patch. If you need to submit a new version of the patch, leave
-                    the old one on the JIRA and add a version number to the name of the new
-                    patch.</para>
-            </step>
-            <step>
-                <para>After a change has been committed, there is no need to keep your local branch
-                    around. Instead you should run <command>git pull</command> to get the new change
-                    into your master branch.</para>
-            </step>
-        </procedure>
-    </section>
-
-    <section>
-        <title>Editing the HBase Website</title>
-        <para>The source for the HBase website is in the HBase source, in the
-                <filename>src/main/site/</filename> directory. Within this directory, source for the
-            individual pages is in the <filename>xdocs/</filename> directory, and images referenced
-            in those pages are in the <filename>images/</filename> directory. This directory also
-            stores images used in the HBase Reference Guide.</para>
-        <para>The website's pages are written in an HTML-like XML dialect called xdoc, which has a
-            reference guide at <link
-                xlink:href="http://maven.apache.org/archives/maven-1.x/plugins/xdoc/reference/xdocs.html"
-                >http://maven.apache.org/archives/maven-1.x/plugins/xdoc/reference/xdocs.html</link>.
-            You can edit these files in a plain-text editor, an IDE, or an XML editor such as
-            XML Mind XML Editor (XXE) or Oxygen XML Author. </para>
-        <para>To preview your changes, build the website using the <command>mvn clean site
-                -DskipTests</command> command. The HTML output resides in the
-                <filename>target/site/</filename> directory. When you are satisfied with your
-            changes, follow the procedure in <xref linkend="submit_doc_patch_procedure"/> to submit
-            your patch.</para>
-    </section>
-
-    <section>
-        <title>Editing the HBase Reference Guide</title>
-        <para>The source for the HBase Reference Guide is in the HBase source, in the
-                <filename>src/main/docbkx/</filename> directory. It is written in <link
-                xlink:href="http://www.docbook.org/">Docbook</link> XML. Docbook can be
-            intimidating, but you can typically follow the formatting of the surrounding file to get
-            an idea of the mark-up. You can edit Docbook XML files using a plain-text editor, an
-            XML-aware IDE, or a specialized XML editor.</para>
-        <para>Docbook's syntax can be picky. Before submitting a patch, be sure to build the output
-            locally using the <command>mvn site</command> command. If you do not get any build
-            errors, that means that the XML is well-formed, which means that each opening tag is
-            balanced by a closing tag. Well-formedness is not exactly the same as validity. Check
-            the output in <filename>target/docbkx/</filename> for any surprises before submitting a
-            patch.</para>
-    </section>
-
-    <section>
-        <title>Auto-Generated Content</title>
-        <para>Some parts of the HBase Reference Guide, most notably <xref linkend="config.files"/>,
-            are generated automatically, so that this area of the documentation stays in sync with
-            the code. This is done by means of an XSLT transform, which you can examine in the
-            source at <filename>src/main/xslt/configuration_to_docbook_section.xsl</filename>. This
-            transforms the <filename>hbase-common/src/main/resources/hbase-default.xml</filename>
-            file into a Docbook output which can be included in the Reference Guide. Sometimes, it
-            is necessary to add configuration parameters or modify their descriptions. Make the
-            modifications to the source file, and they will be included in the Reference Guide when
-            it is rebuilt.</para>
-        <para>It is possible that other types of content can and will be automatically generated
-            from HBase source files in the future.</para>
-    </section>
-
-    <section>
-        <title>Multi-Page and Single-Page Output</title>
-        <para>You can examine the <literal>site</literal> target in the Maven
-                <filename>pom.xml</filename> file included at the top level of the HBase source for
-            details on the process of building the website and documentation. The Reference Guide is
-            built twice, once as a single-page output and once with one HTML file per chapter. The
-            single-page output is located in <filename>target/docbkx/book.html</filename>, while the
-            multi-page output's index page is at <filename>target/docbkx/book/book.html</filename>.
-            Each of these outputs has its own <filename>images/</filename> and
-                <filename>css/</filename> directories, which are created at build time.</para>
-    </section>
-
-    <section>
-        <title>Images in the HBase Reference Guide</title>
-        <para>You can include images in the HBase Reference Guide. For accessibility reasons, it is
-            recommended that you use a &lt;figure&gt; Docbook element for an image. This allows
-            screen readers to navigate to the image and also provides alternative text for the
-            image. The following is an example of a &lt;figure&gt; element.</para>
-        <programlisting language="xml"><![CDATA[<figure>
-  <title>HFile Version 1</title>
-  <mediaobject>
-    <imageobject>
-      <imagedata fileref="timeline_consistency.png" />
-    </imageobject>
-    <textobject>
-      <phrase>HFile Version 1</phrase>
-    </textobject>
-  </mediaobject>
-</figure>]]>
-        </programlisting>
-        <para>The &lt;textobject&gt; can contain a few sentences describing the image, rather than
-            simply reiterating the title. You can optionally specify alignment and size options in
-            the &lt;imagedata&gt; element.</para>
-        <para>When doing a local build, save the image to the
-                <filename>src/main/site/resources/images/</filename> directory. In the
-            &lt;imagedata&gt; element, refer to the image as above, with no directory component. The
-            image will be copied to the appropriate target location during the build of the
-            output.</para>
-        <para>When you submit a patch which includes adding an image to the HBase Reference Guide,
-            attach the image to the JIRA. If the committer asks where the image should be committed,
-            it should go into the above directory.</para>
-    </section>
-
-    <section>
-        <title>Adding a New Chapter to the HBase Reference Guide</title>
-        <para>If you want to add a new chapter to the HBase Reference Guide, the easiest way is to
-            copy an existing chapter file, rename it, and change the ID and title elements near the
-            top of the file. Delete the existing content and create the new content. Then open the
-                <filename>book.xml</filename> file, which is the main file for the HBase Reference
-            Guide, and use an &lt;xi:include&gt; element to include your new chapter in the
-            appropriate location. Be sure to add your new file to your Git repository before
-            creating your patch. Note that the <filename>book.xml</filename> file currently contains
-            many chapters. You can only include a chapter at the same nesting levels as the other
-            chapters in the file. When in doubt, check to see how other files have been
-            included.</para>
-    </section>
-
-    <section>
-        <title>Docbook Common Issues</title>
-        <para>The following Docbook issues come up often. Some of these are preferences, but others
-            can create mysterious build errors or other problems.</para>
-        <qandaset>
-            <qandaentry>
-                <question>
-                    <para>What can go where?</para>
-                </question>
-                <answer>
-                    <para>There is often confusion about which child elements are valid in a given
-                        context. When in doubt, <link
-                            xlink:href="http://docbook.org/tdg/en/html/docbook.html">Docbook: The
-                            Definitive Guide</link> is the best resource. It has an appendix which
-                        is indexed by element and contains all valid child and parent elements of
-                        any given element. If you edit Docbook often, a schema-aware XML editor
-                        makes things easier.</para>
-                </answer>
-            </qandaentry>
-            <qandaentry>
-                <question>
-                    <para>Paragraphs and Admonitions</para>
-                </question>
-                <answer>
-                    <para>It is a common pattern, and it is technically valid, to put an admonition
-                        such as a &lt;note&gt; inside a &lt;para&gt; element. Because admonitions
-                        render as block-level elements (they take the whole width of the page), it
-                        is better to mark them up as siblings to the paragraphs around them, like
-                        this:</para>
-                    <programlisting language="xml"><![CDATA[<para>This is the paragraph.</para>
-<note>
-    <para>This is an admonition which occurs after the paragraph.</para>
-</note>]]></programlisting>
-                </answer>
-            </qandaentry>
-            <qandaentry>
-                <question>
-                    <para>Wrap textual &lt;listitem&gt; and &lt;entry&gt; contents in &lt;para&gt;
-                        elements.</para>
-                </question>
-                <answer>
-                    <para>Because the contents of a &lt;listitem&gt; (an element in an itemized,
-                        ordered, or variable list) or an &lt;entry&gt; (a cell in a table) can
-                        consist of things other than plain text, they need to be wrapped in some
-                        element. If they are plain text, they need to be enclosed in &lt;para&gt;
-                        tags. This is tedious but necessary for validity.</para>
-                    <programlisting language="xml"><![CDATA[<itemizedlist>
-    <listitem>
-        <para>This is a paragraph.</para>
-    </listitem>
-    <listitem>
-        <screen>This is screen output.</screen>
-    </listitem>
-</itemizedlist>]]></programlisting>
-                </answer>
-            </qandaentry>
-            <qandaentry>
-                <question>
-                    <para>When to use &lt;command&gt;, &lt;code&gt;, &lt;programlisting&gt;,
-                        &lt;screen&gt;</para>
-                </question>
-                <answer>
-                    <para>The first two are in-line tags, which can occur within the flow of
-                        paragraphs or titles. The second two are block elements.</para>
-                    <para>Use &lt;command&gt; to mention a command such as <command>hbase
-                            shell</command> in the flow of a sentence. Use &lt;code&gt; for other
-                        inline text referring to code. Incidentally, use &lt;literal&gt; to specify
-                        literal strings that should be typed or entered exactly as shown. Within a
-                        &lt;screen&gt; listing, it can be helpful to use the &lt;userinput&gt; and
-                        &lt;computeroutput&gt; elements to mark up the text further.</para>
-                    <para>Use &lt;screen&gt; to display input and output as the user would
-                            <emphasis>see</emphasis> it on the screen, in a log file, etc. Use
-                        &lt;programlisting&gt; only for blocks of code that occur within a file,
-                        such as Java or XML code, or a Bash shell script.</para>
-                </answer>
-            </qandaentry>
-            <qandaentry>
-                <question>
-                    <para>How to escape XML elements so that they show up as XML</para>
-                </question>
-                <answer>
-                    <para>For one-off instances or short in-line mentions, use the &amp;lt; and
-                        &amp;gt; encoded characters. For longer mentions, or blocks of code, enclose
-                        it with <![CDATA[&lt;![CDATA[]]&gt;]]>, which is much easier to maintain and
-                        parse in the source files.</para>
-                </answer>
-            </qandaentry>
-            <qandaentry>
-                <question>
-                    <para>Tips and tricks for making screen output look good</para>
-                </question>
-                <answer>
-                    <para>Text within &lt;screen&gt; and &lt;programlisting&gt; elements is shown
-                        exactly as it appears in the source, including indentation, tabs, and line
-                        wrap.</para>
-                    <itemizedlist>
-                        <listitem>
-                            <para>Indent the starting and closing XML elements, but do not indent
-                                the content. Also, to avoid having an extra blank line at the
-                                beginning of the programlisting output, do not put the CDATA
-                                element on its own line. For example:</para>
-                            <programlisting language="bourne"><![CDATA[        <programlisting>
-case $1 in
-  --cleanZk|--cleanHdfs|--cleanAll)
-    matches="yes" ;;
-  *) ;;
-esac
-        </programlisting>]]></programlisting>
-                        </listitem>
-                        <listitem>
-                            <para>After pasting code into a programlisting, fix the indentation
-                                manually, using two <emphasis>spaces</emphasis> per desired
-                                indentation. For screen output, be sure to include line breaks so
-                                that the text is no longer than 100 characters.</para>
-                        </listitem>
-                    </itemizedlist>
-                </answer>
-            </qandaentry>
-            <qandaentry>
-                <question>
-                    <para>Isolate Changes for Easy Diff Review.</para>
-                </question>
-                <answer>
-                    <para>Be careful with pretty-printing or re-formatting an entire XML file, even
-                        if the formatting has degraded over time. If you need to reformat a file, do
-                        that in a separate JIRA where you do not change any content. Be careful
-                        because some XML editors do a bulk-reformat when you open a new file,
-                        especially if you use GUI mode in the editor.</para>
-                </answer>
-            </qandaentry>
-            <qandaentry>
-                <question>
-                    <para>Syntax Highlighting</para>
-                </question>
-                <answer>
-                    <para>The HBase Reference Guide uses the <link
-                            xlink:href="http://sourceforge.net/projects/xslthl/files/xslthl/2.1.0/"
-                            >XSLT Syntax Highlighting</link> Maven module for syntax highlighting.
-                        To enable syntax highlighting for a given &lt;programlisting&gt; or
-                        &lt;screen&gt; (or possibly other elements), add the attribute
-                                <literal>language=<replaceable>LANGUAGE_OF_CHOICE</replaceable></literal>
-                        to the element, as in the following example:</para>
-                    <programlisting language="xml"><![CDATA[
-<programlisting language="xml">
-    <foo>bar</foo>
-    <bar>foo</bar>
-</programlisting>]]></programlisting>
-                    <para>Several syntax types are supported. The most interesting ones for the
-                        HBase Reference Guide are <literal>java</literal>, <literal>xml</literal>,
-                            <literal>sql</literal>, and <literal>bourne</literal> (for Bash shell
-                        output or Linux command-line examples).</para>
-                </answer>
-            </qandaentry>
-        </qandaset>
-    </section>
-</appendix>
-
-                      
\ No newline at end of file
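
The removed appendix above notes that a clean build only shows the XML is well-formed, which is not the same as valid. As a rough illustration, here is a minimal sketch of a standalone well-formedness check. It assumes Python's standard `xml.etree.ElementTree` parser, which is not part of the HBase build, and the sample snippets are hypothetical rather than taken from the HBase sources.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, parser message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None


# Hypothetical snippets for illustration only.
balanced = "<para>This is <code>fine</code>.</para>"
unbalanced = "<para>This is <code>broken.</para>"
# Well-formed, but not valid DocBook: the text is not wrapped in <para>.
well_formed_but_invalid = "<listitem>plain text</listitem>"

for sample in (balanced, unbalanced, well_formed_but_invalid):
    print(is_well_formed(sample))
```

The third snippet parses without error even though the appendix says textual list items must be wrapped in a para element, so a check like this cannot replace reviewing the rendered output.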

http://git-wip-us.apache.org/repos/asf/hbase/blob/09a422e8/src/main/docbkx/appendix_hfile_format.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/appendix_hfile_format.xml b/src/main/docbkx/appendix_hfile_format.xml
deleted file mode 100644
index ee43031..0000000
--- a/src/main/docbkx/appendix_hfile_format.xml
+++ /dev/null
@@ -1,657 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<appendix version="5.0" xml:id="hfile_format"
-    xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink"
-    xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:svg="http://www.w3.org/2000/svg"
-    xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:html="http://www.w3.org/1999/xhtml"
-    xmlns:db="http://docbook.org/ns/docbook">
-    <!--
-/**
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
--->
-  <title>HFile format</title>
-  <para>This appendix describes the evolution of the HFile format.</para>
-
-  <section xml:id="hfilev1">
-    <title>HBase File Format (version 1)</title>
-    <para>As we will be discussing changes to the HFile format, it is useful to give a short overview of the original (HFile version 1) format.</para>
-    <section xml:id="hfilev1.overview">
-      <title>Overview of Version 1</title>
-      <para>An HFile in version 1 format is structured as follows:</para>
-      <figure>
-         <title>HFile V1 Format</title>
-         <mediaobject>
-            <imageobject>
-               <imagedata align="center" valign="middle" fileref="hfile.png"/>
-            </imageobject>
-            <textobject>
-               <phrase>HFile Version 1</phrase>
-            </textobject>
-            <caption><para>Image courtesy of Lars George, <link
-                     xlink:href="http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html"
-                     >hbase-architecture-101-storage.html</link>.</para></caption>
-         </mediaobject>
-      </figure>
-
-    </section>
-       <section><title> Block index format in version 1 </title>
-   <para>The block index in version 1 is very straightforward. For each entry, it contains: </para>
-   <orderedlist>
-      <listitem>
-         <para>Offset (long)</para>
-      </listitem>
-      <listitem>
-         <para>Uncompressed size (int)</para>
-      </listitem>
-      <listitem>
-         <para>Key (a serialized byte array written using Bytes.writeByteArray) </para>
-         <orderedlist>
-             <listitem>
-                 <para>Key length as a variable-length integer (VInt)
-                  </para>
-              </listitem>
-             <listitem>
-                 <para>
-                     Key bytes
-                 </para>
-             </listitem>
-         </orderedlist>
-      </listitem>
-   </orderedlist>
-   <para>The number of entries in the block index is stored in the fixed file trailer, and has to be passed in to the method that reads the block index. One of the limitations of the block index in version 1 is that it does not provide the compressed size of a block, which turns out to be necessary for decompression. Therefore, the HFile reader has to infer this compressed size from the offset difference between blocks. We fix this limitation in version 2, where we store on-disk block size instead of uncompressed size, and get uncompressed size from the block header.</para>
-    </section>
-  </section>
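
The version 1 block index entry layout described above (offset, uncompressed size, then a length-prefixed key) can be sketched as a small reader. This is an illustration only, not HBase code. It assumes big-endian fields as written by Java's `DataOutput`, and it assumes the key length uses the Hadoop-style variable-length integer encoding. The function names are hypothetical.

```python
import io
import struct


def read_vint(stream):
    """Decode one variable-length integer (assumed Hadoop-style VInt encoding)."""
    first = struct.unpack(">b", stream.read(1))[0]  # signed first byte
    if first >= -112:
        return first  # small values are stored in a single byte
    negative = first < -120
    # Number of payload bytes that follow the marker byte.
    extra = (-120 - first) if negative else (-112 - first)
    value = int.from_bytes(stream.read(extra), "big")
    return ~value if negative else value


def read_v1_index_entry(stream):
    """Read one entry: offset (long), uncompressed size (int), key bytes."""
    offset, uncompressed_size = struct.unpack(">qi", stream.read(12))
    key_length = read_vint(stream)
    key = stream.read(key_length)
    return offset, uncompressed_size, key


# Hand-built entry for illustration: offset 4096, size 65536, key b"row".
entry = struct.pack(">qi", 4096, 65536) + bytes([3]) + b"row"
print(read_v1_index_entry(io.BytesIO(entry)))
```

Because the entry count lives in the file trailer rather than in the index itself, a real reader would call something like `read_v1_index_entry` in a loop bounded by that count.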
-  <section xml:id="hfilev2"><title>
-      HBase file format with inline blocks (version 2)
-      </title>
-   <para>Note:  this feature was introduced in HBase 0.92</para>
-   <section><title>Motivation </title>
-   <para>We found it necessary to revise the HFile format after encountering high memory usage and slow startup times caused by large Bloom filters and block indexes in the region server. Bloom filters can get as large as 100 MB per HFile, which adds up to 2 GB when aggregated over 20 regions. Block indexes can grow as large as 6 GB in aggregate size over the same set of regions. A region is not considered opened until all of its block index data is loaded. Large Bloom filters produce a different performance problem: the first get request that requires a Bloom filter lookup will incur the latency of loading the entire Bloom filter bit array.</para>
-   <para>To speed up region server startup we break Bloom filters and block indexes into multiple blocks and write those blocks out as they fill up, which also reduces the HFile writer’s memory footprint. In the Bloom filter case, “filling up a block” means accumulating enough keys to efficiently utilize a fixed-size bit array, and in the block index case we accumulate an “index block” of the desired size. Bloom filter blocks and index blocks (we call these “inline blocks”) become interspersed with data blocks, and as a side effect we can no longer rely on the difference between block offsets to determine data block length, as was done in version 1.</para>
-   <para>HFile is a low-level file format by design, and it should not deal with application-specific details such as Bloom filters, which are handled at StoreFile level. Therefore, we call Bloom filter blocks in an HFile "inline" blocks. We also supply HFile with an interface to write those inline blocks. </para>
-   <para>Another format modification aimed at reducing the region server startup time is to use a contiguous “load-on-open” section that has to be loaded in memory at the time an HFile is being opened. Currently, as an HFile opens, there are separate seek operations to read the trailer, data/meta indexes, and file info. To read the Bloom filter, there are two more seek operations for its “data” and “meta” portions. In version 2, we seek once to read the trailer and seek again to read everything else we need to open the file from a contiguous block.</para></section>
-    <section xml:id="hfilev2.overview">
-      <title>Overview of Version 2</title>
-   <para>The version of HBase introducing the above features reads both version 1 and 2 HFiles, but only writes version 2 HFiles. A version 2 HFile is structured as follows:
-           <inlinemediaobject>
-               <imageobject>
-                   <imagedata align="center" valign="middle" fileref="hfilev2.png" />
-               </imageobject>
-               <textobject>
-                 <phrase>HFile Version 2</phrase>
-               </textobject>
-           </inlinemediaobject>
-
-   </para>
-   </section>
-   <section><title>Unified version 2 block format</title>
-   <para>In version 2, every block in the data section contains the following fields: </para>
-   <orderedlist>
-      <listitem>
-         <para>8 bytes: Block type, a sequence of bytes equivalent to version 1's "magic records". Supported block types are: </para>
-         <orderedlist>
-             <listitem>
-                 <para>DATA – data blocks
-                  </para>
-              </listitem>
-             <listitem>
-                 <para>
-                     LEAF_INDEX – leaf-level index blocks in a multi-level block index
-                 </para>
-             </listitem>
-             <listitem>
-                 <para>
-                     BLOOM_CHUNK – Bloom filter chunks
-                  </para>
-              </listitem>
-             <listitem>
-                 <para>
-                     META – meta blocks (not used for Bloom filters in version 2 anymore)
-                  </para>
-              </listitem>
-             <listitem>
-                 <para>
-                     INTERMEDIATE_INDEX – intermediate-level index blocks in a multi-level block index
-                  </para>
-              </listitem>
-             <listitem>
-                 <para>
-                     ROOT_INDEX – root-level index blocks in a multi-level block index
-                  </para>
-              </listitem>
-             <listitem>
-                 <para>
-                     FILE_INFO – the “file info” block, a small key-value map of metadata
-                  </para>
-              </listitem>
-             <listitem>
-                 <para>
-                     BLOOM_META – a Bloom filter metadata block in the load-on-open section
-                  </para>
-              </listitem>
-             <listitem>
-                 <para>
-                     TRAILER – a fixed-size file trailer. As opposed to the above, this is not an
-                     HFile v2 block but a fixed-size (for each HFile version) data structure
-                  </para>
-              </listitem>
-             <listitem>
-                 <para>
-                      INDEX_V1 – this block type is only used for legacy HFile v1 blocks
-                  </para>
-              </listitem>
-         </orderedlist>
-      </listitem>
-      <listitem>
-         <para>Compressed size of the block's data, not including the header (int).
-         </para>
-                 <para>
-Can be used for skipping the current data block when scanning HFile data.
-                  </para>
-      </listitem>
-      <listitem>
-         <para>Uncompressed size of the block's data, not including the header (int)</para>
-                 <para>
- This is equal to the compressed size if the compression algorithm is NONE
-                  </para>
-      </listitem>
-      <listitem>
-         <para>File offset of the previous block of the same type (long)</para>
-                 <para>
- Can be used for seeking to the previous data/index block
-                  </para>
-      </listitem>
-      <listitem>
-         <para>Compressed data (or uncompressed data if the compression algorithm is NONE).</para>
-      </listitem>
-   </orderedlist>
-   <para>The above format of blocks is used in the following HFile sections:</para>
-   <orderedlist>
-      <listitem>
-         <para>Scanned block section. The section is so named because it contains all data blocks that need to be read when an HFile is scanned sequentially. It also contains leaf index blocks and Bloom chunk blocks. </para>
-      </listitem>
-      <listitem>
-         <para>Non-scanned block section. This section still contains unified-format v2 blocks but it does not have to be read when doing a sequential scan. This section contains “meta” blocks and intermediate-level index blocks.
-         </para>
-      </listitem>
-   </orderedlist>
-   <para>We are supporting “meta” blocks in version 2 the same way they were supported in version 1, even though we do not store Bloom filter data in these blocks anymore. </para></section>
-
-<section><title> Block index in version 2</title>
-   <para>There are three types of block indexes in HFile version 2, stored in two different formats (root and non-root): </para>
-   <orderedlist>
-      <listitem>
-         <para>Data index — version 2 multi-level block index, consisting of:</para>
-         <orderedlist>
-          <listitem>
-             <para>
- Version 2 root index, stored in the data block index section of the file
-             </para>
-          </listitem>
-          <listitem>
-             <para>
-Optionally, version 2 intermediate levels, stored in the non-root format in the data index section of the file. Intermediate levels can only be present if leaf level blocks are present
-             </para>
-          </listitem>
-          <listitem>
-             <para>
-Optionally, version 2 leaf levels, stored in the non-root format inline with data blocks
-             </para>
-          </listitem>
-      </orderedlist>
-      </listitem>
-      <listitem>
-         <para>Meta index — version 2 root index format only, stored in the meta index section of the file</para>
-      </listitem>
-      <listitem>
-         <para>Bloom index — version 2 root index format only, stored in the “load-on-open” section as part of Bloom filter metadata.</para>
-      </listitem>
-   </orderedlist></section>
-<section><title>
-      Root block index format in version 2</title>
-   <para>This format applies to:</para>
-   <orderedlist>
-      <listitem>
-         <para>Root level of the version 2 data index</para>
-      </listitem>
-      <listitem>
-         <para>Entire meta and Bloom indexes in version 2, which are always single-level. </para>
-      </listitem>
-   </orderedlist>
-   <para>A version 2 root index block is a sequence of entries of the following format, similar to entries of a version 1 block index, but storing on-disk size instead of uncompressed size. </para>
-   <orderedlist>
-      <listitem>
-         <para>Offset (long) </para>
-             <para>
-This offset may point to a data block or to a deeper-level index block.
-             </para>
-      </listitem>
-      <listitem>
-         <para>On-disk size (int) </para>
-      </listitem>
-      <listitem>
-         <para>Key (a serialized byte array stored using Bytes.writeByteArray) </para>
-         <orderedlist>
-          <listitem>
-             <para>Key length (VInt)
-             </para>
-          </listitem>
-          <listitem>
-             <para>Key bytes
-             </para>
-          </listitem>
-      </orderedlist>
-      </listitem>
-   </orderedlist>
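The entry layout above could be read roughly as follows. This Python sketch assumes big-endian fixed-width fields and that the VInt key length uses Hadoop's variable-length integer encoding (the one written by WritableUtils); the function names are hypothetical, and the number of entries is passed in because, as noted below, it is stored elsewhere in the file.

```python
import struct


def read_vint(buf, pos):
    """Decode a Hadoop-style variable-length int; returns (value, next_pos)."""
    first = buf[pos] - 256 if buf[pos] > 127 else buf[pos]  # as a signed byte
    if first >= -112:
        return first, pos + 1            # small values fit in a single byte
    negative = -120 > first
    total = (-119 - first) if negative else (-111 - first)  # incl. first byte
    value = int.from_bytes(buf[pos + 1:pos + total], "big")
    return (~value if negative else value), pos + total


def read_root_index(buf, num_entries):
    """Parse num_entries root index entries: offset, on-disk size, key."""
    pos = 0
    entries = []
    for _ in range(num_entries):
        offset, on_disk_size = struct.unpack_from(">qi", buf, pos)
        key_len, pos = read_vint(buf, pos + 12)  # 8-byte long + 4-byte int
        entries.append((offset, on_disk_size, bytes(buf[pos:pos + key_len])))
        pos += key_len
    return entries


# Two made-up entries: a 3-byte key and a 300-byte key (multi-byte VInt).
raw = (struct.pack(">qi", 0, 100) + b"\x03abc"
       + struct.pack(">qi", 100, 64) + b"\x8e\x01\x2c" + b"k" * 300)
entries = read_root_index(raw, 2)
```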
-   <para>A single-level version 2 block index consists of just a single root index block. To read a root index block of version 2, one needs to know the number of entries. For the data index and the meta index the number of entries is stored in the trailer, and for the Bloom index it is stored in the compound Bloom filter metadata.</para>
-
-   <para>For a multi-level block index we also store the following fields in the root index block in the load-on-open section of the HFile, in addition to the data structure described above:</para>
-   <orderedlist>
-      <listitem>
-         <para>Middle leaf index block offset</para>
-      </listitem>
-      <listitem>
-         <para>Middle leaf block on-disk size (meaning the leaf index block containing the reference to the “middle” data block of the file) </para>
-      </listitem>
-      <listitem>
-         <para>The index of the mid-key (defined below) in the middle leaf-level block.</para>
-      </listitem>
-   </orderedlist>
-   <para/>
-   <para>These additional fields are used to efficiently retrieve the mid-key of the HFile used in HFile splits, which we define as the first key of the block with a zero-based index of (n – 1) / 2, if the total number of blocks in the HFile is n. This definition is consistent with how the mid-key was determined in HFile version 1, and is reasonable in general, because blocks are likely to be the same size on average, but we don’t have any estimates on individual key/value pair sizes. </para>
-   <para/>
-   <para>When writing a version 2 HFile, the total number of data blocks pointed to by every leaf-level index block is kept track of. When we finish writing and the total number of leaf-level blocks is determined, it is clear which leaf-level block contains the mid-key, and the fields listed above are computed.  When reading the HFile and the mid-key is requested, we retrieve the middle leaf index block (potentially from the block cache) and get the mid-key value from the appropriate position inside that leaf block.</para></section>
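The mid-key bookkeeping described in this section can be sketched in a few lines of Python. The function name and the list-of-counts input are assumptions made for illustration: the sketch takes the number of data blocks referenced by each leaf-level index block, in file order, and returns which leaf block holds the mid-key and at which position.

```python
def locate_mid_key(blocks_per_leaf):
    """Return (leaf_block_index, entry_index_in_leaf) for the mid-key.

    blocks_per_leaf[i] is the number of data blocks referenced by the
    i-th leaf-level index block (hypothetical input format).
    """
    total = sum(blocks_per_leaf)
    if total == 0:
        raise ValueError("an HFile without data blocks has no mid-key")
    remaining = (total - 1) // 2  # zero-based index of the middle data block
    for leaf_index, count in enumerate(blocks_per_leaf):
        if count > remaining:
            return leaf_index, remaining
        remaining -= count
    raise AssertionError("unreachable: counts were consistent with total")


# 11 data blocks in total, so the middle block has zero-based index 5,
# which is entry 1 of the second leaf index block.
middle = locate_mid_key([4, 4, 3])
```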
-<section><title>
-      Non-root block index format in version 2</title>
-   <para>This format applies to intermediate-level and leaf index blocks of a version 2 multi-level data block index. Every non-root index block is structured as follows. </para>
-   <orderedlist>
-      <listitem>
-         <para>numEntries: the number of entries (int). </para>
-      </listitem>
-      <listitem>
-         <para>entryOffsets: the “secondary index” of offsets of entries in the block, to facilitate a quick binary search on the key (numEntries + 1 int values). The last value is the total length of all entries in this index block. For example, in a non-root index block with entry sizes 60, 80, 50 the “secondary index” will contain the following int array: {0, 60, 140, 190}.</para>
-      </listitem>
-      <listitem>
-         <para>Entries. Each entry contains: </para>
-         <orderedlist>
-          <listitem>
-             <para>
-Offset of the block referenced by this entry in the file (long)
-             </para>
-          </listitem>
-          <listitem>
-             <para>
-On-disk size of the referenced block (int)
-             </para>
-          </listitem>
-          <listitem>
-             <para>
-Key. The length can be calculated from entryOffsets.
-             </para>
-          </listitem>
-      </orderedlist>
-
-      </listitem>
-   </orderedlist></section><section><title>
-      Bloom filters in version 2</title>
-   <para>In contrast with version 1, in a version 2 HFile Bloom filter metadata is stored in the load-on-open section of the HFile for quick startup. </para>
-   <orderedlist>
-      <listitem>
-         <para>A compound Bloom filter. </para>
-         <orderedlist>
-          <listitem>
-             <para>
- Bloom filter version = 3 (int). There used to be a DynamicByteBloomFilter class that had the Bloom filter version number 2
-             </para>
-          </listitem>
-          <listitem>
-             <para>
-The total byte size of all compound Bloom filter chunks (long)
-             </para>
-          </listitem>
-          <listitem>
-             <para>
- Number of hash functions (int)
-             </para>
-          </listitem>
-          <listitem>
-             <para>
-Type of hash functions (int)
-             </para>
-          </listitem>
-          <listitem>
-             <para>
-The total key count inserted into the Bloom filter (long)
-             </para>
-          </listitem>
-          <listitem>
-             <para>
-The maximum total number of keys in the Bloom filter (long)
-             </para>
-          </listitem>
-          <listitem>
-             <para>
-The number of chunks (int)
-             </para>
-          </listitem>
-          <listitem>
-             <para>
-Comparator class used for Bloom filter keys, a UTF-8 encoded string stored using Bytes.writeByteArray
-             </para>
-          </listitem>
-          <listitem>
-             <para>
- Bloom block index in the version 2 root block index format
-             </para>
-          </listitem>
-      </orderedlist>
-      </listitem>
-   </orderedlist></section><section><title>File Info format in versions 1 and 2</title>
-   <para>The file info block is a serialized <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/io/HbaseMapWritable.html">HbaseMapWritable</link> (essentially a map from byte arrays to byte arrays) with the following keys, among others. StoreFile-level logic adds more keys to this.</para>
-   <informaltable frame="all">
-      <tgroup cols="2"><tbody><row>
-            <entry>
-               <para>hfile.LASTKEY </para>
-            </entry>
-            <entry>
-               <para>The last key of the file (byte array) </para>
-            </entry>
-         </row>
-         <row>
-            <entry>
-               <para>hfile.AVG_KEY_LEN </para>
-            </entry>
-            <entry>
-               <para>The average key length in the file (int) </para>
-            </entry>
-         </row>
-         <row>
-            <entry>
-               <para>hfile.AVG_VALUE_LEN </para>
-            </entry>
-            <entry>
-               <para>The average value length in the file (int) </para>
-            </entry>
-         </row></tbody></tgroup>
-   </informaltable>
-   <para>File info format did not change in version 2. However, we moved the file info to the final section of the file, which can be loaded as one block when the HFile is opened. Also, we no longer store the comparator in the version 2 file info. Instead, we store it in the fixed file trailer. This is because we need to know the comparator at the time of parsing the load-on-open section of the HFile.</para></section><section><title>
-      Fixed file trailer format differences between versions 1 and 2</title>
-   <para>The following table shows common and different fields between fixed file trailers in versions 1 and 2. Note that the size of the trailer is different depending on the version, so it is “fixed” only within one version. However, the version is always stored as the last four-byte integer in the file. </para>
-   <para/>
-   <informaltable frame="all">
-      <tgroup cols="2">
-<colspec colname='c1'/>
-<colspec colname='c2'/>
-<tbody>
-    <row>
-            <entry>
-               <para>Version 1 </para>
-            </entry>
-            <entry>
-               <para>Version 2 </para>
-            </entry>
-         </row>
-         <row>
-            <entry align="center" namest="c1" nameend="c2">
-               <para>File info offset (long) </para>
-            </entry>
-         </row>
-         <row>
-            <entry>
-               <para>Data index offset (long) </para>
-            </entry>
-            <entry>
-                <para>loadOnOpenOffset (long)</para>
-                <para><emphasis>The offset of the section that we need to load when opening the file.</emphasis></para>
-            </entry>
-         </row>
-         <row>
-            <entry align="center" namest="c1" nameend="c2">
-               <para>Number of data index entries (int) </para>
-            </entry>
-         </row>
-         <row>
-            <entry>
-               <para>metaIndexOffset (long)</para>
-               <para>This field is not being used by the version 1 reader, so we removed it from version 2.</para>
-            </entry>
-            <entry>
-               <para>uncompressedDataIndexSize (long)</para>
-               <para>The total uncompressed size of the whole data block index, including root-level, intermediate-level, and leaf-level blocks.</para>
-            </entry>
-         </row>
-         <row>
-            <entry namest="c1" nameend="c2" align="center">
-               <para>Number of meta index entries (int) </para>
-            </entry>
-         </row>
-         <row>
-            <entry namest="c1" nameend="c2" align="center">
-               <para>Total uncompressed bytes (long) </para>
-            </entry>
-         </row>
-         <row>
-            <entry>
-               <para>numEntries (int) </para>
-            </entry>
-            <entry>
-               <para>numEntries (long) </para>
-            </entry>
-         </row>
-         <row>
-            <entry namest="c1" nameend="c2" align="center">
-               <para>Compression codec: 0 = LZO, 1 = GZ, 2 = NONE (int) </para>
-            </entry>
-         </row>
-         <row>
-            <entry>
-               <para></para>
-            </entry>
-            <entry>
-               <para>The number of levels in the data block index (int) </para>
-            </entry>
-         </row>
-         <row>
-            <entry>
-               <para></para>
-            </entry>
-            <entry>
-               <para>firstDataBlockOffset (long)</para>
-               <para>The offset of the first data block. Used when scanning. </para>
-            </entry>
-         </row>
-         <row>
-            <entry>
-               <para></para>
-            </entry>
-            <entry>
-               <para>lastDataBlockEnd (long)</para>
-               <para>The offset of the first byte after the last key/value data block. We don't need to go beyond this offset when scanning. </para>
-            </entry>
-         </row>
-         <row>
-            <entry>
-               <para>Version: 1 (int) </para>
-            </entry>
-            <entry>
-               <para>Version: 2 (int) </para>
-            </entry>
-         </row></tbody></tgroup>
-   </informaltable>
-   <para/></section>
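Because the version is the last four-byte integer in the file, a reader can decide which trailer layout to expect before parsing anything else. A minimal Python sketch, assuming the integer is big-endian; the function name is hypothetical and the in-memory buffer stands in for a real HFile:

```python
import io
import struct


def read_trailer_version(f):
    """Return the int stored in the last four bytes of a binary file object."""
    f.seek(-4, io.SEEK_END)
    (version,) = struct.unpack(">i", f.read(4))  # big-endian is an assumption
    return version


# Stand-in for an HFile: arbitrary payload followed by the version int.
fake_hfile = io.BytesIO(b"...payload..." + struct.pack(">i", 2))
version = read_trailer_version(fake_hfile)
```

Some HBase releases are understood to pack a minor version into the high byte of this integer; the sketch does not account for that and simply returns the raw value.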
-   <section><title>getShortMidpointKey (an optimization for data index block)</title>
-     <para>Note: this optimization was introduced in HBase 0.95+</para>
-       <para>HFiles contain many blocks that contain a range of sorted Cells. Each cell has a key. To save IO when reading Cells, the HFile also has an index that maps a Cell's start key to the offset of the beginning of a particular block. Prior to this optimization, HBase would use the key of the first cell in each data block as the index key.</para>
-     <para>In HBASE-7845, we generate a new key that is lexicographically larger than the last key of the previous block and lexicographically equal to or smaller than the start key of the current block. While actual keys can potentially be very long, this "fake key" or "virtual key" can be much shorter. For example, if the stop key of the previous block is "the quick brown fox" and the start key of the current block is "the who", we could use "the r" as our virtual key in our hfile index.</para>
-     <para>There are two benefits to this:</para>
-     <itemizedlist>
-     <listitem><para>having shorter keys reduces the hfile index size (allowing us to keep more indexes in memory), and</para></listitem>
-     <listitem><para>using something closer to the end key of the previous block allows us to avoid a potential extra IO when the target key lives in between the "virtual key" and the key of the first element in the target block.</para></listitem>
-     </itemizedlist>
-     <para>This optimization (implemented by the getShortMidpointKey method) is inspired by LevelDB's BytewiseComparatorImpl::FindShortestSeparator() and FindShortSuccessor().</para>
-   </section>
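The idea behind the virtual key can be sketched on plain byte strings. This Python example is not HBase's getShortMidpointKey, which operates on full cell keys (row, family, qualifier, timestamp); it only shows a shortest-separator computation in the style of the LevelDB routine mentioned above, with a hypothetical function name.

```python
def short_midpoint_key(prev_last, cur_first):
    """Return a short key that sorts after prev_last and not after cur_first.

    Both arguments are bytes compared in plain bytewise order
    (a simplification of real HBase cell keys).
    """
    if not cur_first > prev_last:
        raise ValueError("keys must be strictly increasing across blocks")
    for i, (a, b) in enumerate(zip(prev_last, cur_first)):
        if a != b:
            if b > a + 1:
                # Room between the two bytes: bump the byte and truncate.
                return prev_last[:i] + bytes([a + 1])
            # Adjacent bytes: keep cur_first up to and including position i.
            return cur_first[:i + 1]
    # prev_last is a proper prefix of cur_first: one extra byte is enough.
    return cur_first[:len(prev_last) + 1]


# The example from the text: the separator is much shorter than either key.
virtual_key = short_midpoint_key(b"the quick brown fox", b"the who")
```

A lookup for any key between the virtual key and the first real key of the current block is routed to the current block, and a lookup for a key after the previous block's last key but before the virtual key is routed to the previous block, which is the source of the second benefit listed above.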
-  </section>
-  <section xml:id="hfilev3">
-    <title>HBase File Format with Security Enhancements (version 3)</title>
-    <para>Note: this feature was introduced in HBase 0.98</para>
-    <section xml:id="hfilev3.motivation">
-      <title>Motivation </title>
-      <para>
-        Version 3 of HFile makes changes needed to ease management of encryption at rest and
-        cell-level metadata (which in turn is needed for cell-level ACLs and cell-level visibility
-        labels). For more information see <xref linkend="hbase.encryption.server"/>,
-        <xref linkend="hbase.tags"/>, <xref linkend="hbase.accesscontrol.configuration"/>, and
-        <xref linkend="hbase.visibility.labels"/>.
-      </para>
-    </section>
-    <section xml:id="hfilev3.overview">
-      <title>Overview</title>
-      <para>
-        The version of HBase introducing the above features reads HFiles in versions 1, 2, and 3 but
-        only writes version 3 HFiles. Version 3 HFiles are structured the same as version 2 HFiles.
-        For more information see <xref linkend="hfilev2.overview"/>.
-      </para>
-    </section>
-    <section xml:id="hvilev3.infoblock">
-      <title>File Info Block in Version 3</title>
-      <para>
-        Version 3 added two additional pieces of information to the reserved keys in the file info
-        block.
-        <informaltable frame="all">
-           <tgroup cols="2">
-             <tbody>
-              <row>
-                 <entry>
-                    <para>hfile.MAX_TAGS_LEN</para>
-                 </entry>
-                 <entry>
-                    <para>
-                      The maximum number of bytes needed to store the serialized tags for any single
-                      cell in this hfile (int)
-                    </para>
-                 </entry>
-              </row>
-               <row>
-                 <entry>
-                    <para>hfile.TAGS_COMPRESSED</para>
-                 </entry>
-                 <entry>
-                    <para>Does the block encoder for this hfile compress tags? (boolean)</para>
-                    <para>
-                      Should only be present if <classname>hfile.MAX_TAGS_LEN</classname> is also
-                      present.
-                    </para>
-                 </entry>
-              </row>
-            </tbody>
-          </tgroup>
-        </informaltable>
-      </para>
-      <para>
-        When reading a Version 3 HFile the presence of <classname>MAX_TAGS_LEN</classname> is used
-        to determine how to deserialize the cells within a data block. Therefore, consumers must
-        read the file's info block prior to reading any data blocks.
-      </para>
-      <para>
-        When writing a Version 3 HFile, HBase will always include <classname>MAX_TAGS_LEN
-        </classname> when flushing the memstore to underlying filesystem and when using prefix tree
-        encoding for data blocks, as described in <xref linkend="compression"/>. When compacting
-        extant files, the default writer will omit <classname>MAX_TAGS_LEN</classname> if none of the
-        files selected contain any cells with tags. See
-        <xref linkend="compaction"/> for details on the compaction file selection algorithm.
-      </para>
-    </section>
-    <section xml:id="hfilev3.datablock">
-      <title>Data Blocks in Version 3</title>
-      <para>
-        Within an HFile, HBase cells are stored in data blocks as a sequence of KeyValues (see <xref
-        linkend="hfilev1.overview"/>, or <link xlink:href=
-        "http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html">Lars George's
-        excellent introduction to HBase Storage</link>). In version 3, these KeyValues will
-        optionally include a set of 0 or more tags:
-        <informaltable frame="all">
-          <tgroup cols="2">
-            <colspec colname='c1'/>
-            <colspec colname='c2'/>
-            <tbody>
-              <row>
-                <entry>
-                  <para>Version 1 &amp; 2</para>
-                  <para>Version 3 without MAX_TAGS_LEN</para>
-                </entry>
-                <entry><para>Version 3 with MAX_TAGS_LEN</para></entry>
-              </row>
-              <row>
-                <entry align="center" namest="c1" nameend="c2">
-                  <para>Key Length (4 bytes)</para>
-                </entry>
-              </row>
-              <row>
-                <entry align="center" namest="c1" nameend="c2">
-                  <para>Value Length (4 bytes)</para>
-                </entry>
-              </row>
-              <row>
-                <entry align="center" namest="c1" nameend="c2">
-                  <para>Key bytes (variable)</para>
-                </entry>
-              </row>
-              <row>
-                <entry align="center" namest="c1" nameend="c2">
-                  <para>Value bytes (variable)</para>
-                </entry>
-              </row>
-              <row>
-                <entry align="center" namest="c2" nameend="c2">
-                  <para>Tags Length (2 bytes)</para>
-                </entry>
-              </row>
-              <row>
-                <entry align="center" namest="c2" nameend="c2">
-                  <para>Tags bytes (variable)</para>
-                </entry>
-              </row>
-            </tbody>
-          </tgroup>
-        </informaltable>
-      </para>
-      <para>
-        If the info block for a given HFile contains an entry for
-        <classname>MAX_TAGS_LEN</classname> each cell will have the length of that cell's tags
-        included, even if that length is zero. The actual tags are stored as a sequence of tag
-        length (2 bytes), tag type (1 byte), tag bytes (variable). The format of an individual tag's
-        bytes depends on the tag type.
-      </para>
-      <para>
-        Note that the dependence on the contents of the info block implies that prior to reading
-        any data blocks you must first process a file's info block. It also implies that prior to
-        writing a data block you must know if the file's info block will include
-        <classname>MAX_TAGS_LEN</classname>.
-      </para>
-    </section>
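To make the dependence on the info block concrete, here is a Python sketch that reads one cell using the layout in the table above. It rests on several assumptions: all lengths are big-endian, the per-tag length field counts the type byte plus the tag bytes, and the caller has already derived `has_tags` from whether MAX_TAGS_LEN is present in the file info. The function name is hypothetical, and the sketch ignores anything a real data block may store beyond the fields in the table.

```python
import struct


def parse_cell(buf, pos, has_tags):
    """Parse one KeyValue at buf[pos]; returns (key, value, tags, next_pos)."""
    key_len, value_len = struct.unpack_from(">ii", buf, pos)
    pos += 8
    key = bytes(buf[pos:pos + key_len])
    pos += key_len
    value = bytes(buf[pos:pos + value_len])
    pos += value_len
    tags = []
    if has_tags:
        # Present for every cell when MAX_TAGS_LEN is set, even if zero.
        (tags_len,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        end = pos + tags_len
        while end > pos:
            (tag_len,) = struct.unpack_from(">H", buf, pos)
            tag_type = buf[pos + 2]
            # Assumption: tag_len covers the type byte and the tag bytes.
            tags.append((tag_type, bytes(buf[pos + 3:pos + 2 + tag_len])))
            pos += 2 + tag_len
    return key, value, tags, pos


# A made-up cell with one tag of type 1 carrying the bytes b"abc".
one_tag = struct.pack(">HB", 4, 1) + b"abc"
cell = (struct.pack(">ii", 3, 2) + b"key" + b"vv"
        + struct.pack(">H", len(one_tag)) + one_tag)
key, value, tags, next_pos = parse_cell(cell, 0, has_tags=True)
```

Parsing the same bytes with `has_tags=False` would stop after the value and misread the tags length as the start of the next cell, which is why the info block has to be processed first.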
-    <section xml:id="hfilev3.fixedtrailer">
-      <title>Fixed File Trailer in Version 3</title>
-      <para>
-        The fixed file trailers written with HFile version 3 are always serialized with protocol
-        buffers. Additionally, version 3 adds an optional field to the version 2 protocol buffer named
-        encryption_key. If HBase is configured to encrypt HFiles this field will store a data
-        encryption key for this particular HFile, encrypted with the current cluster master key
-        using AES. For more information see <xref linkend="hbase.encryption.server"/>.
-      </para>
-    </section>
-  </section>
-</appendix>

