hbase-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From e...@apache.org
Subject [07/34] hbase git commit: HBASE-12918 Backport asciidoc changes (apurtell and enis)
Date Wed, 28 Jan 2015 03:30:52 GMT
http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/docbkx/rpc.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/rpc.xml b/src/main/docbkx/rpc.xml
deleted file mode 100644
index 2e5dd5f..0000000
--- a/src/main/docbkx/rpc.xml
+++ /dev/null
@@ -1,301 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<appendix
-    xml:id="hbase.rpc"
-    version="5.0"
-    xmlns="http://docbook.org/ns/docbook"
-    xmlns:xlink="http://www.w3.org/1999/xlink"
-    xmlns:xi="http://www.w3.org/2001/XInclude"
-    xmlns:svg="http://www.w3.org/2000/svg"
-    xmlns:m="http://www.w3.org/1998/Math/MathML"
-    xmlns:html="http://www.w3.org/1999/xhtml"
-    xmlns:db="http://docbook.org/ns/docbook">
-    <!--/**
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
--->
-
-    <title>0.95 RPC Specification</title>
-    <para>In 0.95, all client/server communication is done with <link
-            xlink:href="https://code.google.com/p/protobuf/">protobuf’ed</link> Messages rather than
-        with <link
-            xlink:href="http://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/Writable.html">Hadoop
-            Writables</link>. Our RPC wire format therefore changes. This document describes the
-        client/server request/response protocol and our new RPC wire-format.</para>
-    <para />
-    <para>For what RPC is like in 0.94 and previous, see Benoît/Tsuna’s <link
-            xlink:href="https://github.com/OpenTSDB/asynchbase/blob/master/src/HBaseRpc.java#L164">Unofficial
-            Hadoop / HBase RPC protocol documentation</link>. For more background on how we arrived
-        at this spec., see <link
-            xlink:href="https://docs.google.com/document/d/1WCKwgaLDqBw2vpux0jPsAu2WPTRISob7HGCO8YhfDTA/edit#">HBase
-            RPC: WIP</link></para>
-    <para />
-    <section>
-        <title>Goals</title>
-        <para>
-            <orderedlist>
-                <listitem>
-                    <para>A wire-format we can evolve</para>
-                </listitem>
-                <listitem>
-                    <para>A format that does not require our rewriting server core or radically
-                        changing its current architecture (for later).</para>
-                </listitem>
-            </orderedlist>
-        </para>
-    </section>
-    <section>
-        <title>TODO</title>
-        <para>
-            <orderedlist>
-                <listitem>
-                    <para>List of problems with currently specified format and where we would like
-                        to go in a version2, etc. For example, what would we have to change if
-                        anything to move server async or to support streaming/chunking?</para>
-                </listitem>
-                <listitem>
-                    <para>Diagram on how it works</para>
-                </listitem>
-                <listitem>
-                    <para>A grammar that succinctly describes the wire-format. Currently we have
-                        these words and the content of the rpc protobuf idl but a grammar for the
-                        back and forth would help with groking rpc. Also, a little state machine on
-                        client/server interactions would help with understanding (and ensuring
-                        correct implementation).</para>
-                </listitem>
-            </orderedlist>
-        </para>
-    </section>
-    <section>
-        <title>RPC</title>
-        <para>The client will send setup information on connection establish. Thereafter, the client
-            invokes methods against the remote server sending a protobuf Message and receiving a
-            protobuf Message in response. Communication is synchronous. All back and forth is
-            preceded by an int that has the total length of the request/response. Optionally,
-            Cells(KeyValues) can be passed outside of protobufs in follow-behind Cell blocks
-            (because <link
-                xlink:href="https://docs.google.com/document/d/1WEtrq-JTIUhlnlnvA0oYRLp0F8MKpEBeBSCFcQiacdw/edit#">we
-                can’t protobuf megabytes of KeyValues</link> or Cells). These CellBlocks are encoded
-            and optionally compressed.</para>
-        <para />
-        <para>For more detail on the protobufs involved, see the <link
-                xlink:href="http://svn.apache.org/viewvc/hbase/trunk/hbase-protocol/src/main/protobuf/RPC.proto?view=markup">RPC.proto</link>
-            file in trunk.</para>
-
-        <section>
-            <title>Connection Setup</title>
-            <para>Client initiates connection.</para>
-            <section>
-                <title>Client</title>
-                <para>On connection setup, client sends a preamble followed by a connection header. </para>
-
-                <section>
-                    <title>&lt;preamble&gt;</title>
-                    <programlisting>&lt;MAGIC 4 byte integer&gt; &lt;1 byte RPC Format Version&gt; &lt;1 byte auth type&gt;</programlisting>
-                    <para> We need the auth method spec. here so the connection header is encoded if auth enabled.</para>
-                    <para>E.g.: HBas0x000x50 -- 4 bytes of MAGIC -- ‘HBas’ -- plus one-byte of
-                        version, 0 in this case, and one byte, 0x50 (SIMPLE). of an auth
-                        type.</para>
-                </section>
-
-                <section>
-                    <title>&lt;Protobuf ConnectionHeader Message&gt;</title>
-                    <para>Has user info, and “protocol”, as well as the encoders and compression the
-                        client will use sending CellBlocks. CellBlock encoders and compressors are
-                        for the life of the connection. CellBlock encoders implement
-                        org.apache.hadoop.hbase.codec.Codec. CellBlocks may then also be compressed.
-                        Compressors implement org.apache.hadoop.io.compress.CompressionCodec. This
-                        protobuf is written using writeDelimited so is prefaced by a pb varint with
-                        its serialized length</para>
-                </section>
-            </section>
-            <!--Client-->
-
-            <section>
-                <title>Server</title>
-                <para>After client sends preamble and connection header, server does NOT respond if
-                    successful connection setup. No response means server is READY to accept
-                    requests and to give out response. If the version or authentication in the
-                    preamble is not agreeable or the server has trouble parsing the preamble, it
-                    will throw a org.apache.hadoop.hbase.ipc.FatalConnectionException explaining the
-                    error and will then disconnect. If the client in the connection header -- i.e.
-                    the protobuf’d Message that comes after the connection preamble -- asks for for
-                    a Service the server does not support or a codec the server does not have, again
-                    we throw a FatalConnectionException with explanation.</para>
-            </section>
-        </section>
-
-        <section>
-            <title>Request</title>
-            <para>After a Connection has been set up, client makes requests. Server responds.</para>
-            <para>A request is made up of a protobuf RequestHeader followed by a protobuf Message
-                parameter. The header includes the method name and optionally, metadata on the
-                optional CellBlock that may be following. The parameter type suits the method being
-                invoked: i.e. if we are doing a getRegionInfo request, the protobuf Message param
-                will be an instance of GetRegionInfoRequest. The response will be a
-                GetRegionInfoResponse. The CellBlock is optionally used ferrying the bulk of the RPC
-                data: i.e Cells/KeyValues.</para>
-            <section>
-                <title>Request Parts</title>
-                <section>
-                    <title>&lt;Total Length&gt;</title>
-                    <para>The request is prefaced by an int that holds the total length of what
-                        follows.</para>
-                </section>
-                <section>
-                    <title>&lt;Protobuf RequestHeader Message&gt;</title>
-                    <para>Will have call.id, trace.id, and method name, etc. including optional
-                        Metadata on the Cell block IFF one is following. Data is protobuf’d inline
-                        in this pb Message or optionally comes in the following CellBlock</para>
-                </section>
-                <section>
-                    <title>&lt;Protobuf Param Message&gt;</title>
-                    <para>If the method being invoked is getRegionInfo, if you study the Service
-                        descriptor for the client to regionserver protocol, you will find that the
-                        request sends a GetRegionInfoRequest protobuf Message param in this
-                        position.</para>
-                </section>
-                <section>
-                    <title>&lt;CellBlock&gt;</title>
-                    <para>An encoded and optionally compressed Cell block.</para>
-                </section>
-            </section>
-            <!--Request parts-->
-        </section>
-        <!--Request-->
-
-        <section>
-            <title>Response</title>
-            <para>Same as Request, it is a protobuf ResponseHeader followed by a protobuf Message
-                response where the Message response type suits the method invoked. Bulk of the data
-                may come in a following CellBlock.</para>
-            <section>
-                <title>Response Parts</title>
-                <section>
-                    <title>&lt;Total Length&gt;</title>
-                    <para>The response is prefaced by an int that holds the total length of what
-                        follows.</para>
-                </section>
-                <section>
-                    <title>&lt;Protobuf ResponseHeader Message&gt;</title>
-                    <para>Will have call.id, etc. Will include exception if failed processing.
-                         Optionally includes metadata on optional, IFF there is a CellBlock
-                        following.</para>
-                </section>
-
-                <section>
-                    <title>&lt;Protobuf Response Message&gt;</title>
-                    <para>Return or may be nothing if exception. If the method being invoked is
-                        getRegionInfo, if you study the Service descriptor for the client to
-                        regionserver protocol, you will find that the response sends a
-                        GetRegionInfoResponse protobuf Message param in this position.</para>
-                </section>
-                <section>
-                    <title>&lt;CellBlock&gt;</title>
-                    <para>An encoded and optionally compressed Cell block.</para>
-                </section>
-            </section>
-            <!--Parts-->
-        </section>
-        <!--Response-->
-
-        <section>
-            <title>Exceptions</title>
-            <para>There are two distinct types. There is the request failed which is encapsulated
-                inside the response header for the response. The connection stays open to receive
-                new requests. The second type, the FatalConnectionException, kills the
-                connection.</para>
-            <para>Exceptions can carry extra information. See the ExceptionResponse protobuf type.
-                It has a flag to indicate do-no-retry as well as other miscellaneous payload to help
-                improve client responsiveness.</para>
-        </section>
-        <section>
-            <title>CellBlocks</title>
-            <para>These are not versioned. Server can do the codec or it cannot. If new version of a
-                codec with say, tighter encoding, then give it a new class name. Codecs will live on
-                the server for all time so old clients can connect.</para>
-        </section>
-    </section>
-
-
-    <section>
-        <title>Notes</title>
-        <section>
-            <title>Constraints</title>
-            <para>In some part, current wire-format -- i.e. all requests and responses preceeded by
-                a length -- has been dictated by current server non-async architecture.</para>
-        </section>
-        <section>
-            <title>One fat pb request or header+param</title>
-            <para>We went with pb header followed by pb param making a request and a pb header
-                followed by pb response for now. Doing header+param rather than a single protobuf
-                Message with both header and param content:</para>
-            <para>
-                <orderedlist>
-                    <listitem>
-                        <para>Is closer to what we currently have</para>
-                    </listitem>
-                    <listitem>
-                        <para>Having a single fat pb requires extra copying putting the already pb’d
-                            param into the body of the fat request pb (and same making
-                            result)</para>
-                    </listitem>
-                    <listitem>
-                        <para>We can decide whether to accept the request or not before we read the
-                            param; for example, the request might be low priority.  As is, we read
-                            header+param in one go as server is currently implemented so this is a
-                            TODO.</para>
-                    </listitem>
-                </orderedlist>
-            </para>
-            <para>The advantages are minor.  If later, fat request has clear advantage, can roll out
-                a v2 later.</para>
-        </section>
-        <section
-            xml:id="rpc.configs">
-            <title>RPC Configurations</title>
-            <section>
-                <title>CellBlock Codecs</title>
-                <para>To enable a codec other than the default <classname>KeyValueCodec</classname>,
-                    set <varname>hbase.client.rpc.codec</varname> to the name of the Codec class to
-                    use. Codec must implement hbase's <classname>Codec</classname> Interface. After
-                    connection setup, all passed cellblocks will be sent with this codec. The server
-                    will return cellblocks using this same codec as long as the codec is on the
-                    servers' CLASSPATH (else you will get
-                        <classname>UnsupportedCellCodecException</classname>).</para>
-                <para>To change the default codec, set
-                        <varname>hbase.client.default.rpc.codec</varname>. </para>
-                <para>To disable cellblocks completely and to go pure protobuf, set the default to
-                    the empty String and do not specify a codec in your Configuration. So, set
-                        <varname>hbase.client.default.rpc.codec</varname> to the empty string and do
-                    not set <varname>hbase.client.rpc.codec</varname>. This will cause the client to
-                    connect to the server with no codec specified. If a server sees no codec, it
-                    will return all responses in pure protobuf. Running pure protobuf all the time
-                    will be slower than running with cellblocks. </para>
-            </section>
-            <section>
-                <title>Compression</title>
-                <para>Uses hadoops compression codecs. To enable compressing of passed CellBlocks,
-                    set <varname>hbase.client.rpc.compressor</varname> to the name of the Compressor
-                    to use. Compressor must implement Hadoops' CompressionCodec Interface. After
-                    connection setup, all passed cellblocks will be sent compressed. The server will
-                    return cellblocks compressed using this same compressor as long as the
-                    compressor is on its CLASSPATH (else you will get
-                        <classname>UnsupportedCompressionCodecException</classname>).</para>
-            </section>
-        </section>
-    </section>
-</appendix>

http://git-wip-us.apache.org/repos/asf/hbase/blob/cb77a925/src/main/docbkx/schema_design.xml
----------------------------------------------------------------------
diff --git a/src/main/docbkx/schema_design.xml b/src/main/docbkx/schema_design.xml
deleted file mode 100644
index e4632ec..0000000
--- a/src/main/docbkx/schema_design.xml
+++ /dev/null
@@ -1,1262 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<chapter
-  version="5.0"
-  xml:id="schema"
-  xmlns="http://docbook.org/ns/docbook"
-  xmlns:xlink="http://www.w3.org/1999/xlink"
-  xmlns:xi="http://www.w3.org/2001/XInclude"
-  xmlns:svg="http://www.w3.org/2000/svg"
-  xmlns:m="http://www.w3.org/1998/Math/MathML"
-  xmlns:html="http://www.w3.org/1999/xhtml"
-  xmlns:db="http://docbook.org/ns/docbook">
-  <!--
-/**
- *
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
--->
-  <title>HBase and Schema Design</title>
-  <para>A good general introduction on the strength and weaknesses modelling on the various
-    non-rdbms datastores is Ian Varley's Master thesis, <link
-      xlink:href="http://ianvarley.com/UT/MR/Varley_MastersReport_Full_2009-08-07.pdf">No Relation:
-      The Mixed Blessings of Non-Relational Databases</link>. Recommended. Also, read <xref
-      linkend="keyvalue" /> for how HBase stores data internally, and the section on <xref
-      linkend="schema.casestudies" />. </para>
-  <section
-    xml:id="schema.creation">
-    <title> Schema Creation </title>
-    <para>HBase schemas can be created or updated with <xref
-        linkend="shell" /> or by using <link
-        xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html">HBaseAdmin</link>
-      in the Java API. </para>
-    <para>Tables must be disabled when making ColumnFamily modifications, for example:</para>
-    <programlisting language="java">
-Configuration config = HBaseConfiguration.create();
-HBaseAdmin admin = new HBaseAdmin(conf);
-String table = "myTable";
-
-admin.disableTable(table);
-
-HColumnDescriptor cf1 = ...;
-admin.addColumn(table, cf1);      // adding new ColumnFamily
-HColumnDescriptor cf2 = ...;
-admin.modifyColumn(table, cf2);    // modifying existing ColumnFamily
-
-admin.enableTable(table);
-    </programlisting>
-    <para>See <xref
-        linkend="client_dependencies" /> for more information about configuring client
-      connections.</para>
-    <para>Note: online schema changes are supported in the 0.92.x codebase, but the 0.90.x codebase
-      requires the table to be disabled. </para>
-    <section
-      xml:id="schema.updates">
-      <title>Schema Updates</title>
-      <para>When changes are made to either Tables or ColumnFamilies (e.g., region size, block
-        size), these changes take effect the next time there is a major compaction and the
-        StoreFiles get re-written. </para>
-      <para>See <xref
-          linkend="store" /> for more information on StoreFiles. </para>
-    </section>
-  </section>
-  <section
-    xml:id="number.of.cfs">
-    <title> On the number of column families </title>
-    <para> HBase currently does not do well with anything above two or three column families so keep
-      the number of column families in your schema low. Currently, flushing and compactions are done
-      on a per Region basis so if one column family is carrying the bulk of the data bringing on
-      flushes, the adjacent families will also be flushed though the amount of data they carry is
-      small. When many column families the flushing and compaction interaction can make for a bunch
-      of needless i/o loading (To be addressed by changing flushing and compaction to work on a per
-      column family basis). For more information on compactions, see <xref
-        linkend="compaction" />. </para>
-    <para>Try to make do with one column family if you can in your schemas. Only introduce a second
-      and third column family in the case where data access is usually column scoped; i.e. you query
-      one column family or the other but usually not both at the one time. </para>
-    <section
-      xml:id="number.of.cfs.card">
-      <title>Cardinality of ColumnFamilies</title>
-      <para>Where multiple ColumnFamilies exist in a single table, be aware of the cardinality
-        (i.e., number of rows). If ColumnFamilyA has 1 million rows and ColumnFamilyB has 1 billion
-        rows, ColumnFamilyA's data will likely be spread across many, many regions (and
-        RegionServers). This makes mass scans for ColumnFamilyA less efficient. </para>
-    </section>
-  </section>
-  <section
-    xml:id="rowkey.design">
-    <title>Rowkey Design</title>
-    <section>
-      <title>Hotspotting</title>
-      <para>Rows in HBase are sorted lexicographically by row key. This design optimizes for scans,
-        allowing you to store related rows, or rows that will be read together, near each other.
-        However, poorly designed row keys are a common source of <firstterm>hotspotting</firstterm>.
-        Hotspotting occurs when a large amount of client traffic is directed at one node, or only a
-        few nodes, of a cluster. This traffic may represent reads, writes, or other operations. The
-        traffic overwhelms the single machine responsible for hosting that region, causing
-        performance degradation and potentially leading to region unavailability. This can also have
-        adverse effects on other regions hosted by the same region server as that host is unable to
-        service the requested load. It is important to design data access patterns such that the
-        cluster is fully and evenly utilized.</para>
-      <para>To prevent hotspotting on writes, design your row keys such that rows that truly do need
-        to be in the same region are, but in the bigger picture, data is being written to multiple
-        regions across the cluster, rather than one at a time. Some common techniques for avoiding
-        hotspotting are described below, along with some of their advantages and drawbacks.</para>
-      <formalpara>
-        <title>Salting</title>
-        <para>Salting in this sense has nothing to do with cryptography, but refers to adding random
-          data to the start of a row key. In this case, salting refers to adding a randomly-assigned
-          prefix to the row key to cause it to sort differently than it otherwise would. The number
-          of possible prefixes correspond to the number of regions you want to spread the data
-          across. Salting can be helpful if you have a few "hot" row key patterns which come up over
-          and over amongst other more evenly-distributed rows. Consider the following example, which
-          shows that salting can spread write load across multiple regionservers, and illustrates
-          some of the negative implications for reads.</para>
-      </formalpara>
-      <example>
-        <title>Salting Example</title>
-        <para>Suppose you have the following list of row keys, and your table is split such that
-          there is one region for each letter of the alphabet. Prefix 'a' is one region, prefix 'b'
-          is another. In this table, all rows starting with 'f' are in the same region. This example
-          focuses on rows with keys like the following:</para>
-        <screen>
-foo0001
-foo0002
-foo0003
-foo0004          
-        </screen>
-        <para>Now, imagine that you would like to spread these across four different regions. You
-          decide to use four different salts: <literal>a</literal>, <literal>b</literal>,
-            <literal>c</literal>, and <literal>d</literal>. In this scenario, each of these letter
-          prefixes will be on a different region. After applying the salts, you have the following
-          rowkeys instead. Since you can now write to four separate regions, you theoretically have
-          four times the throughput when writing that you would have if all the writes were going to
-          the same region.</para>
-        <screen>
-a-foo0003
-b-foo0001
-c-foo0004
-d-foo0002          
-        </screen>
-        <para>Then, if you add another row, it will randomly be assigned one of the four possible
-          salt values and end up near one of the existing rows.</para>
-        <screen>
-a-foo0003
-b-foo0001
-<emphasis>c-foo0003</emphasis>
-c-foo0004
-d-foo0002        
-        </screen>
-        <para>Since this assignment will be random, you will need to do more work if you want to
-          retrieve the rows in lexicographic order. In this way, salting attempts to increase
-          throughput on writes, but has a cost during reads.</para>
-      </example>
-      <para></para>
-      <formalpara>
-        <title>Hashing</title>
-        <para>Instead of a random assignment, you could use a one-way <firstterm>hash</firstterm>
-          that would cause a given row to always be "salted" with the same prefix, in a way that
-          would spread the load across the regionservers, but allow for predictability during reads.
-          Using a deterministic hash allows the client to reconstruct the complete rowkey and use a
-          Get operation to retrieve that row as normal.</para>
-      </formalpara>
-      <example>
-        <title>Hashing Example</title>
-        <para>Given the same situation in the salting example above, you could instead apply a
-          one-way hash that would cause the row with key <literal>foo0003</literal> to always, and
-          predictably, receive the <literal>a</literal> prefix. Then, to retrieve that row, you
-          would already know the key. You could also optimize things so that certain pairs of keys
-          were always in the same region, for instance.</para>
-      </example>
-      <formalpara>
-        <title>Reversing the Key</title>
-        <para>A third common trick for preventing hotspotting is to reverse a fixed-width or numeric
-        row key so that the part that changes the most often (the least significant digit) is first.
-        This effectively randomizes row keys, but sacrifices row ordering properties.</para>
-      </formalpara>
-      <para>See <link
-          xlink:href="https://communities.intel.com/community/itpeernetwork/datastack/blog/2013/11/10/discussion-on-designing-hbase-tables"
-          >https://communities.intel.com/community/itpeernetwork/datastack/blog/2013/11/10/discussion-on-designing-hbase-tables</link>,
-        and <link xlink:href="http://phoenix.apache.org/salted.html">article on Salted Tables</link>
-        from the Phoenix project, and the discussion in the comments of <link
-          xlink:href="https://issues.apache.org/jira/browse/HBASE-11682">HBASE-11682</link> for more
-        information about avoiding hotspotting.</para>
-    </section>
-    <section
-      xml:id="timeseries">
-      <title> Monotonically Increasing Row Keys/Timeseries Data </title>
-      <para> In the HBase chapter of Tom White's book <link
-          xlink:href="http://oreilly.com/catalog/9780596521981">Hadoop: The Definitive Guide</link>
-        (O'Reilly) there is a an optimization note on watching out for a phenomenon where an import
-        process walks in lock-step with all clients in concert pounding one of the table's regions
-        (and thus, a single node), then moving onto the next region, etc. With monotonically
-        increasing row-keys (i.e., using a timestamp), this will happen. See this comic by IKai Lan
-        on why monotonically increasing row keys are problematic in BigTable-like datastores: <link
-          xlink:href="http://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/">monotonically
-          increasing values are bad</link>. The pile-up on a single region brought on by
-        monotonically increasing keys can be mitigated by randomizing the input records to not be in
-        sorted order, but in general it's best to avoid using a timestamp or a sequence (e.g. 1, 2,
-        3) as the row-key. </para>
-      <para>If you do need to upload time series data into HBase, you should study <link
-          xlink:href="http://opentsdb.net/">OpenTSDB</link> as a successful example. It has a page
-        describing the <link
-          xlink:href=" http://opentsdb.net/schema.html">schema</link> it uses in HBase. The key
-        format in OpenTSDB is effectively [metric_type][event_timestamp], which would appear at
-        first glance to contradict the previous advice about not using a timestamp as the key.
-        However, the difference is that the timestamp is not in the <emphasis>lead</emphasis>
-        position of the key, and the design assumption is that there are dozens or hundreds (or
-        more) of different metric types. Thus, even with a continual stream of input data with a mix
-        of metric types, the Puts are distributed across various points of regions in the table. </para>
-      <para>See <xref
-          linkend="schema.casestudies" /> for some rowkey design examples. </para>
-    </section>
-    <section
-      xml:id="keysize">
-      <title>Try to minimize row and column sizes</title>
-      <subtitle>Or why are my StoreFile indices large?</subtitle>
-      <para>In HBase, values are always freighted with their coordinates; as a cell value passes
-        through the system, it'll be accompanied by its row, column name, and timestamp - always. If
-        your rows and column names are large, especially compared to the size of the cell value,
-        then you may run up against some interesting scenarios. One such is the case described by
-        Marc Limotte at the tail of <link
-          xlink:href="https://issues.apache.org/jira/browse/HBASE-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=13005272#comment-13005272">HBASE-3551</link>
-        (recommended!). Therein, the indices that are kept on HBase storefiles (<xref
-          linkend="hfile" />) to facilitate random access may end up occupyng large chunks of the
-        HBase allotted RAM because the cell value coordinates are large. Mark in the above cited
-        comment suggests upping the block size so entries in the store file index happen at a larger
-        interval or modify the table schema so it makes for smaller rows and column names.
-        Compression will also make for larger indices. See the thread <link
-          xlink:href="http://search-hadoop.com/m/hemBv1LiN4Q1/a+question+storefileIndexSize&amp;subj=a+question+storefileIndexSize">a
-          question storefileIndexSize</link> up on the user mailing list. </para>
-      <para>Most of the time small inefficiencies don't matter all that much. Unfortunately, this is
-        a case where they do. Whatever patterns are selected for ColumnFamilies, attributes, and
-        rowkeys they could be repeated several billion times in your data. </para>
-      <para>See <xref
-          linkend="keyvalue" /> for more information on HBase stores data internally to see why this
-        is important.</para>
-      <section
-        xml:id="keysize.cf">
-        <title>Column Families</title>
-        <para>Try to keep the ColumnFamily names as small as possible, preferably one character
-          (e.g. "d" for data/default). </para>
-        <para>See <xref
-            linkend="keyvalue" /> for more information on HBase stores data internally to see why
-          this is important.</para>
-      </section>
-      <section
-        xml:id="keysize.attributes">
-        <title>Attributes</title>
-        <para>Although verbose attribute names (e.g., "myVeryImportantAttribute") are easier to
-          read, prefer shorter attribute names (e.g., "via") to store in HBase. </para>
-        <para>See <xref
-            linkend="keyvalue" /> for more information on HBase stores data internally to see why
-          this is important.</para>
-      </section>
-      <section
-        xml:id="keysize.row">
-        <title>Rowkey Length</title>
-        <para>Keep them as short as is reasonable such that they can still be useful for required
-          data access (e.g., Get vs. Scan). A short key that is useless for data access is not
-          better than a longer key with better get/scan properties. Expect tradeoffs when designing
-          rowkeys. </para>
-      </section>
-      <section
-        xml:id="keysize.patterns">
-        <title>Byte Patterns</title>
-        <para>A long is 8 bytes. You can store an unsigned number up to 18,446,744,073,709,551,615
-          in those eight bytes. If you stored this number as a String -- presuming a byte per
-          character -- you need nearly 3x the bytes. </para>
-        <para>Not convinced? Below is some sample code that you can run on your own.</para>
-        <programlisting language="java">
-// long
-//
-long l = 1234567890L;
-byte[] lb = Bytes.toBytes(l);
-System.out.println("long bytes length: " + lb.length);   // returns 8
-
-String s = "" + l;
-byte[] sb = Bytes.toBytes(s);
-System.out.println("long as string length: " + sb.length);    // returns 10
-
-// hash
-//
-MessageDigest md = MessageDigest.getInstance("MD5");
-byte[] digest = md.digest(Bytes.toBytes(s));
-System.out.println("md5 digest bytes length: " + digest.length);    // returns 16
-
-String sDigest = new String(digest);
-byte[] sbDigest = Bytes.toBytes(sDigest);
-System.out.println("md5 digest as string length: " + sbDigest.length);    // returns 26
-        </programlisting>
-        <para>Unfortunately, using a binary representation of a type will make your data harder to
-          read outside of your code. For example, this is what you will see in the shell when you
-          increment a value:</para>
-        <programlisting>
-hbase(main):001:0> incr 't', 'r', 'f:q', 1
-COUNTER VALUE = 1
-
-hbase(main):002:0> get 't', 'r'
-COLUMN                                        CELL
- f:q                                          timestamp=1369163040570, value=\x00\x00\x00\x00\x00\x00\x00\x01
-1 row(s) in 0.0310 seconds
-        </programlisting>
-        <para>The shell makes a best effort to print a string, and it this case it decided to just
-          print the hex. The same will happen to your row keys inside the region names. It can be
-          okay if you know what's being stored, but it might also be unreadable if arbitrary data
-          can be put in the same cells. This is the main trade-off. </para>
-      </section>
-
-    </section>
-    <section
-      xml:id="reverse.timestamp">
-      <title>Reverse Timestamps</title>
-      <note>
-        <title>Reverse Scan API</title>
-        <para>
-          <link
-            xlink:href="https://issues.apache.org/jira/browse/HBASE-4811">HBASE-4811</link>
-          implements an API to scan a table or a range within a table in reverse, reducing the need
-          to optimize your schema for forward or reverse scanning. This feature is available in
-          HBase 0.98 and later. See <link
-            xlink:href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setReversed%28boolean" />
-          for more information. </para>
-      </note>
-
-      <para>A common problem in database processing is quickly finding the most recent version of a
-        value. A technique using reverse timestamps as a part of the key can help greatly with a
-        special case of this problem. Also found in the HBase chapter of Tom White's book Hadoop:
-        The Definitive Guide (O'Reilly), the technique involves appending (<code>Long.MAX_VALUE -
-          timestamp</code>) to the end of any key, e.g., [key][reverse_timestamp]. </para>
-      <para>The most recent value for [key] in a table can be found by performing a Scan for [key]
-        and obtaining the first record. Since HBase keys are in sorted order, this key sorts before
-        any older row-keys for [key] and thus is first. </para>
-      <para>This technique would be used instead of using <xref
-          linkend="schema.versions" /> where the intent is to hold onto all versions "forever" (or a
-        very long time) and at the same time quickly obtain access to any other version by using the
-        same Scan technique. </para>
-    </section>
-    <section
-      xml:id="rowkey.scope">
-      <title>Rowkeys and ColumnFamilies</title>
-      <para>Rowkeys are scoped to ColumnFamilies. Thus, the same rowkey could exist in each
-        ColumnFamily that exists in a table without collision. </para>
-    </section>
-    <section
-      xml:id="changing.rowkeys">
-      <title>Immutability of Rowkeys</title>
-      <para>Rowkeys cannot be changed. The only way they can be "changed" in a table is if the row
-        is deleted and then re-inserted. This is a fairly common question on the HBase dist-list so
-        it pays to get the rowkeys right the first time (and/or before you've inserted a lot of
-        data). </para>
-    </section>
-    <section
-      xml:id="rowkey.regionsplits">
-      <title>Relationship Between RowKeys and Region Splits</title>
-      <para>If you pre-split your table, it is <emphasis>critical</emphasis> to understand how your
-        rowkey will be distributed across the region boundaries. As an example of why this is
-        important, consider the example of using displayable hex characters as the lead position of
-        the key (e.g., "0000000000000000" to "ffffffffffffffff"). Running those key ranges through
-          <code>Bytes.split</code> (which is the split strategy used when creating regions in
-          <code>HBaseAdmin.createTable(byte[] startKey, byte[] endKey, numRegions)</code> for 10
-        regions will generate the following splits...</para>
-      <screen>
-48 48 48 48 48 48 48 48 48 48 48 48 48 48 48 48                                // 0
-54 -10 -10 -10 -10 -10 -10 -10 -10 -10 -10 -10 -10 -10 -10 -10                 // 6
-61 -67 -67 -67 -67 -67 -67 -67 -67 -67 -67 -67 -67 -67 -67 -68                 // =
-68 -124 -124 -124 -124 -124 -124 -124 -124 -124 -124 -124 -124 -124 -124 -126  // D
-75 75 75 75 75 75 75 75 75 75 75 75 75 75 75 72                                // K
-82 18 18 18 18 18 18 18 18 18 18 18 18 18 18 14                                // R
-88 -40 -40 -40 -40 -40 -40 -40 -40 -40 -40 -40 -40 -40 -40 -44                 // X
-95 -97 -97 -97 -97 -97 -97 -97 -97 -97 -97 -97 -97 -97 -97 -102                // _
-102 102 102 102 102 102 102 102 102 102 102 102 102 102 102 102                // f
-      </screen>
-      <para>... (note: the lead byte is listed to the right as a comment.) Given that the first
-        split is a '0' and the last split is an 'f', everything is great, right? Not so fast. </para>
-      <para>The problem is that all the data is going to pile up in the first 2 regions and the last
-        region thus creating a "lumpy" (and possibly "hot") region problem. To understand why, refer
-        to an <link
-          xlink:href="http://www.asciitable.com">ASCII Table</link>. '0' is byte 48, and 'f' is byte
-        102, but there is a huge gap in byte values (bytes 58 to 96) that will <emphasis>never
-          appear in this keyspace</emphasis> because the only values are [0-9] and [a-f]. Thus, the
-        middle regions regions will never be used. To make pre-spliting work with this example
-        keyspace, a custom definition of splits (i.e., and not relying on the built-in split method)
-        is required. </para>
-      <para>Lesson #1: Pre-splitting tables is generally a best practice, but you need to pre-split
-        them in such a way that all the regions are accessible in the keyspace. While this example
-        demonstrated the problem with a hex-key keyspace, the same problem can happen with
-          <emphasis>any</emphasis> keyspace. Know your data. </para>
-      <para>Lesson #2: While generally not advisable, using hex-keys (and more generally,
-        displayable data) can still work with pre-split tables as long as all the created regions
-        are accessible in the keyspace. </para>
-      <para>To conclude this example, the following is an example of how appropriate splits can be
-        pre-created for hex-keys:. </para>
-      <programlisting language="java"><![CDATA[public static boolean createTable(HBaseAdmin admin, HTableDescriptor table, byte[][] splits)
-throws IOException {
-  try {
-    admin.createTable( table, splits );
-    return true;
-  } catch (TableExistsException e) {
-    logger.info("table " + table.getNameAsString() + " already exists");
-    // the table already exists...
-    return false;
-  }
-}
-
-public static byte[][] getHexSplits(String startKey, String endKey, int numRegions) {
-  byte[][] splits = new byte[numRegions-1][];
-  BigInteger lowestKey = new BigInteger(startKey, 16);
-  BigInteger highestKey = new BigInteger(endKey, 16);
-  BigInteger range = highestKey.subtract(lowestKey);
-  BigInteger regionIncrement = range.divide(BigInteger.valueOf(numRegions));
-  lowestKey = lowestKey.add(regionIncrement);
-  for(int i=0; i < numRegions-1;i++) {
-    BigInteger key = lowestKey.add(regionIncrement.multiply(BigInteger.valueOf(i)));
-    byte[] b = String.format("%016x", key).getBytes();
-    splits[i] = b;
-  }
-  return splits;
-}]]></programlisting>
-    </section>
-  </section>
-  <!--  rowkey design -->
-  <section
-    xml:id="schema.versions">
-    <title> Number of Versions </title>
-    <section
-      xml:id="schema.versions.max">
-      <title>Maximum Number of Versions</title>
-      <para>The maximum number of row versions to store is configured per column family via <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html">HColumnDescriptor</link>.
-        The default for max versions is 1. This is an important parameter because as described in <xref
-          linkend="datamodel" /> section HBase does <emphasis>not</emphasis> overwrite row values,
-        but rather stores different values per row by time (and qualifier). Excess versions are
-        removed during major compactions. The number of max versions may need to be increased or
-        decreased depending on application needs. </para>
-      <para>It is not recommended setting the number of max versions to an exceedingly high level
-        (e.g., hundreds or more) unless those old values are very dear to you because this will
-        greatly increase StoreFile size. </para>
-    </section>
-    <section
-      xml:id="schema.minversions">
-      <title> Minimum Number of Versions </title>
-      <para>Like maximum number of row versions, the minimum number of row versions to keep is
-        configured per column family via <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html">HColumnDescriptor</link>.
-        The default for min versions is 0, which means the feature is disabled. The minimum number
-        of row versions parameter is used together with the time-to-live parameter and can be
-        combined with the number of row versions parameter to allow configurations such as "keep the
-        last T minutes worth of data, at most N versions, <emphasis>but keep at least M versions
-          around</emphasis>" (where M is the value for minimum number of row versions, M&lt;N). This
-        parameter should only be set when time-to-live is enabled for a column family and must be
-        less than the number of row versions. </para>
-    </section>
-  </section>
-  <section
-    xml:id="supported.datatypes">
-    <title> Supported Datatypes </title>
-    <para>HBase supports a "bytes-in/bytes-out" interface via <link
-        xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Put.html">Put</link>
-      and <link
-        xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Result.html">Result</link>,
-      so anything that can be converted to an array of bytes can be stored as a value. Input could
-      be strings, numbers, complex objects, or even images as long as they can rendered as bytes. </para>
-    <para>There are practical limits to the size of values (e.g., storing 10-50MB objects in HBase
-      would probably be too much to ask); search the mailling list for conversations on this topic.
-      All rows in HBase conform to the <xref
-        linkend="datamodel" />, and that includes versioning. Take that into consideration when
-      making your design, as well as block size for the ColumnFamily. </para>
-
-    <section
-      xml:id="counters">
-      <title>Counters</title>
-      <para> One supported datatype that deserves special mention are "counters" (i.e., the ability
-        to do atomic increments of numbers). See <link
-          xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#increment%28org.apache.hadoop.hbase.client.Increment%29">Increment</link>
-        in HTable. </para>
-      <para>Synchronization on counters are done on the RegionServer, not in the client. </para>
-    </section>
-  </section>
-  <section
-    xml:id="schema.joins">
-    <title>Joins</title>
-    <para>If you have multiple tables, don't forget to factor in the potential for <xref
-        linkend="joins" /> into the schema design. </para>
-  </section>
-  <section
-    xml:id="ttl">
-    <title>Time To Live (TTL)</title>
-    <para> ColumnFamilies can set a TTL length in seconds, and HBase will automatically delete rows
-      once the expiration time is reached. This applies to <emphasis>all</emphasis> versions of a
-      row - even the current one. The TTL time encoded in the HBase for the row is specified in UTC.
-    </para>
-    <para> Store files which contains only expired rows are deleted on minor compaction. Setting
-        <varname>hbase.store.delete.expired.storefile</varname> to <code>false</code> disables this
-      feature. Setting <link linkend="schema.minversions">minimum number of versions</link> to other
-      than 0 also disables this.</para>
-    <para> See <link
-        xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html"
-        >HColumnDescriptor</link> for more information. </para>
-    <para>Recent versions of HBase also support setting time to live on a per cell basis. See <link
-        xlink:href="https://issues.apache.org/jira/browse/HBASE-10560">HBASE-10560</link> for more
-        information. Cell TTLs are submitted as an attribute on mutation requests (Appends,
-        Increments, Puts, etc.) using Mutation#setTTL. If the TTL attribute is set, it will be applied
-        to all cells updated on the server by the operation. There are two notable differences
-        between cell TTL handling and ColumnFamily TTLs:</para>
-    <itemizedlist>
-      <listitem>
-        <para>Cell TTLs are expressed in units of milliseconds instead of seconds.</para>
-      </listitem>
-      <listitem>
-        <para>A cell TTLs cannot extend the effective lifetime of a cell beyond a ColumnFamily level
-          TTL setting.</para>
-      </listitem>
-    </itemizedlist>
-  </section>
-  <section
-    xml:id="cf.keep.deleted">
-    <title> Keeping Deleted Cells </title>
-    <para>By default, delete markers extend back to the beginning of time. Therefore, <link
-        xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html">Get</link>
-      or <link
-        xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">Scan</link>
-      operations will not see a deleted cell (row or column), even when the Get or Scan operation
-      indicates a time range
-      before the delete marker was placed.</para> 
-    <para>ColumnFamilies can optionally keep deleted cells. In this case, deleted cells can still be
-      retrieved, as long as these operations specify a time range that ends before the timestamp of
-      any delete that would affect the cells. This allows for point-in-time queries even in the
-      presence of deletes. </para>
-    <para> Deleted cells are still subject to TTL and there will never be more than "maximum number
-      of versions" deleted cells. A new "raw" scan options returns all deleted rows and the delete
-      markers. </para>
-    <example>
-      <title>Change the Value of <code>KEEP_DELETED_CELLS</code> Using HBase Shell</title>
-      <screen>hbase> hbase> alter ‘t1′, NAME => ‘f1′, KEEP_DELETED_CELLS => true</screen>
-    </example>
-    <example>
-      <title>Change the Value of <code>KEEP_DELETED_CELLS</code> Using the API</title>
-      <programlisting language="java">...
-HColumnDescriptor.setKeepDeletedCells(true);
-...
-      </programlisting>
-    </example>
-    <para>See the API documentation for <link
-        xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HColumnDescriptor.html#KEEP_DELETED_CELLS"
-        >KEEP_DELETED_CELLS</link> for more information. </para>
-  </section>
-  <section
-    xml:id="secondary.indexes">
-    <title> Secondary Indexes and Alternate Query Paths </title>
-    <para>This section could also be titled "what if my table rowkey looks like
-        <emphasis>this</emphasis> but I also want to query my table like <emphasis>that</emphasis>."
-      A common example on the dist-list is where a row-key is of the format "user-timestamp" but
-      there are reporting requirements on activity across users for certain time ranges. Thus,
-      selecting by user is easy because it is in the lead position of the key, but time is not. </para>
-    <para>There is no single answer on the best way to handle this because it depends on... </para>
-    <itemizedlist>
-      <listitem>
-        <para>Number of users</para>
-      </listitem>
-      <listitem>
-        <para>Data size and data arrival rate</para>
-      </listitem>
-      <listitem>
-        <para>Flexibility of reporting requirements (e.g., completely ad-hoc date selection vs.
-          pre-configured ranges) </para>
-      </listitem>
-      <listitem>
-        <para>Desired execution speed of query (e.g., 90 seconds may be reasonable to some for an
-          ad-hoc report, whereas it may be too long for others) </para>
-      </listitem>
-    </itemizedlist>
-    <para>... and solutions are also influenced by the size of the cluster and how much processing
-      power you have to throw at the solution. Common techniques are in sub-sections below. This is
-      a comprehensive, but not exhaustive, list of approaches. </para>
-    <para>It should not be a surprise that secondary indexes require additional cluster space and
-      processing. This is precisely what happens in an RDBMS because the act of creating an
-      alternate index requires both space and processing cycles to update. RDBMS products are more
-      advanced in this regard to handle alternative index management out of the box. However, HBase
-      scales better at larger data volumes, so this is a feature trade-off. </para>
-    <para>Pay attention to <xref
-        linkend="performance" /> when implementing any of these approaches.</para>
-    <para>Additionally, see the David Butler response in this dist-list thread <link
-        xlink:href="http://search-hadoop.com/m/nvbiBp2TDP/Stargate%252Bhbase&amp;subj=Stargate+hbase">HBase,
-        mail # user - Stargate+hbase</link>
-    </para>
-    <section
-      xml:id="secondary.indexes.filter">
-      <title> Filter Query </title>
-      <para>Depending on the case, it may be appropriate to use <xref
-          linkend="client.filter" />. In this case, no secondary index is created. However, don't
-        try a full-scan on a large table like this from an application (i.e., single-threaded
-        client). </para>
-    </section>
-    <section
-      xml:id="secondary.indexes.periodic">
-      <title> Periodic-Update Secondary Index </title>
-      <para>A secondary index could be created in an other table which is periodically updated via a
-        MapReduce job. The job could be executed intra-day, but depending on load-strategy it could
-        still potentially be out of sync with the main data table.</para>
-      <para>See <xref
-          linkend="mapreduce.example.readwrite" /> for more information.</para>
-    </section>
-    <section
-      xml:id="secondary.indexes.dualwrite">
-      <title> Dual-Write Secondary Index </title>
-      <para>Another strategy is to build the secondary index while publishing data to the cluster
-        (e.g., write to data table, write to index table). If this is approach is taken after a data
-        table already exists, then bootstrapping will be needed for the secondary index with a
-        MapReduce job (see <xref
-          linkend="secondary.indexes.periodic" />).</para>
-    </section>
-    <section
-      xml:id="secondary.indexes.summary">
-      <title> Summary Tables </title>
-      <para>Where time-ranges are very wide (e.g., year-long report) and where the data is
-        voluminous, summary tables are a common approach. These would be generated with MapReduce
-        jobs into another table.</para>
-      <para>See <xref
-          linkend="mapreduce.example.summary" /> for more information.</para>
-    </section>
-    <section
-      xml:id="secondary.indexes.coproc">
-      <title> Coprocessor Secondary Index </title>
-      <para>Coprocessors act like RDBMS triggers. These were added in 0.92. For more information,
-        see <xref
-          linkend="coprocessors" />
-      </para>
-    </section>
-  </section>
-  <section
-    xml:id="constraints">
-    <title>Constraints</title>
-    <para>HBase currently supports 'constraints' in traditional (SQL) database parlance. The advised
-      usage for Constraints is in enforcing business rules for attributes in the table (eg. make
-      sure values are in the range 1-10). Constraints could also be used to enforce referential
-      integrity, but this is strongly discouraged as it will dramatically decrease the write
-      throughput of the tables where integrity checking is enabled. Extensive documentation on using
-      Constraints can be found at: <link
-        xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/constraint">Constraint</link>
-      since version 0.94. </para>
-  </section>
-  <section
-    xml:id="schema.casestudies">
-    <title>Schema Design Case Studies</title>
-    <para>The following will describe some typical data ingestion use-cases with HBase, and how the
-      rowkey design and construction can be approached. Note: this is just an illustration of
-      potential approaches, not an exhaustive list. Know your data, and know your processing
-      requirements. </para>
-    <para>It is highly recommended that you read the rest of the <xref
-        linkend="schema" /> first, before reading these case studies. </para>
-    <para>The following case studies are described: </para>
-    <itemizedlist>
-      <listitem>
-        <para>Log Data / Timeseries Data</para>
-      </listitem>
-      <listitem>
-        <para>Log Data / Timeseries on Steroids</para>
-      </listitem>
-      <listitem>
-        <para>Customer/Order</para>
-      </listitem>
-      <listitem>
-        <para>Tall/Wide/Middle Schema Design</para>
-      </listitem>
-      <listitem>
-        <para>List Data</para>
-      </listitem>
-    </itemizedlist>
-    <section
-      xml:id="schema.casestudies.log-timeseries">
-      <title>Case Study - Log Data and Timeseries Data</title>
-      <para>Assume that the following data elements are being collected. </para>
-      <itemizedlist>
-        <listitem>
-          <para>Hostname</para>
-        </listitem>
-        <listitem>
-          <para>Timestamp</para>
-        </listitem>
-        <listitem>
-          <para>Log event</para>
-        </listitem>
-        <listitem>
-          <para>Value/message</para>
-        </listitem>
-      </itemizedlist>
-      <para>We can store them in an HBase table called LOG_DATA, but what will the rowkey be? From
-        these attributes the rowkey will be some combination of hostname, timestamp, and log-event -
-        but what specifically? </para>
-      <section
-        xml:id="schema.casestudies.log-timeseries.tslead">
-        <title>Timestamp In The Rowkey Lead Position</title>
-        <para>The rowkey <code>[timestamp][hostname][log-event]</code> suffers from the
-          monotonically increasing rowkey problem described in <xref
-            linkend="timeseries" />. </para>
-        <para>There is another pattern frequently mentioned in the dist-lists about “bucketing”
-          timestamps, by performing a mod operation on the timestamp. If time-oriented scans are
-          important, this could be a useful approach. Attention must be paid to the number of
-          buckets, because this will require the same number of scans to return results.</para>
-        <programlisting language="java">
-long bucket = timestamp % numBuckets;
-        </programlisting>
-        <para>… to construct:</para>
-        <programlisting>
-[bucket][timestamp][hostname][log-event]
-        </programlisting>
-        <para>As stated above, to select data for a particular timerange, a Scan will need to be
-          performed for each bucket. 100 buckets, for example, will provide a wide distribution in
-          the keyspace but it will require 100 Scans to obtain data for a single timestamp, so there
-          are trade-offs. </para>
-      </section>
-      <!-- ts lead -->
-      <section
-        xml:id="schema.casestudies.log-timeseries.hostlead">
-        <title>Host In The Rowkey Lead Position</title>
-        <para>The rowkey <code>[hostname][log-event][timestamp]</code> is a candidate if there is a
-          large-ish number of hosts to spread the writes and reads across the keyspace. This
-          approach would be useful if scanning by hostname was a priority. </para>
-      </section>
-      <!--  host lead -->
-      <section
-        xml:id="schema.casestudies.log-timeseries.revts">
-        <title>Timestamp, or Reverse Timestamp?</title>
-        <para>If the most important access path is to pull most recent events, then storing the
-          timestamps as reverse-timestamps (e.g., <code>timestamp = Long.MAX_VALUE –
-            timestamp</code>) will create the property of being able to do a Scan on
-            <code>[hostname][log-event]</code> to obtain the quickly obtain the most recently
-          captured events. </para>
-        <para>Neither approach is wrong, it just depends on what is most appropriate for the
-          situation. </para>
-        <note>
-          <title>Reverse Scan API</title>
-          <para>
-            <link
-              xlink:href="https://issues.apache.org/jira/browse/HBASE-4811">HBASE-4811</link>
-            implements an API to scan a table or a range within a table in reverse, reducing the
-            need to optimize your schema for forward or reverse scanning. This feature is available
-            in HBase 0.98 and later. See <link
-              xlink:href="https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setReversed%28boolean" />
-            for more information. </para>
-        </note>
-      </section>
-      <!--  revts -->
-      <section
-        xml:id="schema.casestudies.log-timeseries.varkeys">
-        <title>Variangle Length or Fixed Length Rowkeys?</title>
-        <para>It is critical to remember that rowkeys are stamped on every column in HBase. If the
-          hostname is “a” and the event type is “e1” then the resulting rowkey would be quite small.
-          However, what if the ingested hostname is “myserver1.mycompany.com” and the event type is
-          “com.package1.subpackage2.subsubpackage3.ImportantService”? </para>
-        <para>It might make sense to use some substitution in the rowkey. There are at least two
-          approaches: hashed and numeric. In the Hostname In The Rowkey Lead Position example, it
-          might look like this: </para>
-        <para>Composite Rowkey With Hashes:</para>
-        <itemizedlist>
-          <listitem>
-            <para>[MD5 hash of hostname] = 16 bytes</para>
-          </listitem>
-          <listitem>
-            <para>[MD5 hash of event-type] = 16 bytes</para>
-          </listitem>
-          <listitem>
-            <para>[timestamp] = 8 bytes</para>
-          </listitem>
-        </itemizedlist>
-        <para>Composite Rowkey With Numeric Substitution: </para>
-        <para>For this approach another lookup table would be needed in addition to LOG_DATA, called
-          LOG_TYPES. The rowkey of LOG_TYPES would be: </para>
-        <itemizedlist>
-          <listitem>
-            <para>[type] (e.g., byte indicating hostname vs. event-type)</para>
-          </listitem>
-          <listitem>
-            <para>[bytes] variable length bytes for raw hostname or event-type.</para>
-          </listitem>
-        </itemizedlist>
-        <para>A column for this rowkey could be a long with an assigned number, which could be
-          obtained by using an <link
-            xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#incrementColumnValue%28byte[],%20byte[],%20byte[],%20long%29">HBase
-            counter</link>. </para>
-        <para>So the resulting composite rowkey would be: </para>
-        <itemizedlist>
-          <listitem>
-            <para>[substituted long for hostname] = 8 bytes</para>
-          </listitem>
-          <listitem>
-            <para>[substituted long for event type] = 8 bytes</para>
-          </listitem>
-          <listitem>
-            <para>[timestamp] = 8 bytes</para>
-          </listitem>
-        </itemizedlist>
-        <para>In either the Hash or Numeric substitution approach, the raw values for hostname and
-          event-type can be stored as columns. </para>
-      </section>
-      <!--  varkeys -->
-    </section>
-    <!--  log data and timeseries -->
-    <section
-      xml:id="schema.casestudies.log-steroids">
-      <title>Case Study - Log Data and Timeseries Data on Steroids</title>
-      <para>This effectively is the OpenTSDB approach. What OpenTSDB does is re-write data and pack
-        rows into columns for certain time-periods. For a detailed explanation, see: <link
-          xlink:href="http://opentsdb.net/schema.html">http://opentsdb.net/schema.html</link>, and <link
-          xlink:href="http://www.cloudera.com/content/cloudera/en/resources/library/hbasecon/video-hbasecon-2012-lessons-learned-from-opentsdb.html">Lessons
-          Learned from OpenTSDB</link> from HBaseCon2012. </para>
-      <para>But this is how the general concept works: data is ingested, for example, in this
-        manner…</para>
-      <screen>
-[hostname][log-event][timestamp1]
-[hostname][log-event][timestamp2]
-[hostname][log-event][timestamp3]
-        </screen>
-      <para>… with separate rowkeys for each detailed event, but is re-written like this… </para>
-      <screen>[hostname][log-event][timerange]</screen>
-      <para>… and each of the above events are converted into columns stored with a time-offset
-        relative to the beginning timerange (e.g., every 5 minutes). This is obviously a very
-        advanced processing technique, but HBase makes this possible. </para>
-    </section>
-    <!--  log data timeseries steroids -->
-
-    <section
-      xml:id="schema.casestudies.custorder">
-      <title>Case Study - Customer/Order</title>
-      <para>Assume that HBase is used to store customer and order information. There are two core
-        record-types being ingested: a Customer record type, and Order record type. </para>
-      <para>The Customer record type would include all the things that you’d typically expect: </para>
-      <itemizedlist>
-        <listitem>
-          <para>Customer number</para>
-        </listitem>
-        <listitem>
-          <para>Customer name</para>
-        </listitem>
-        <listitem>
-          <para>Address (e.g., city, state, zip)</para>
-        </listitem>
-        <listitem>
-          <para>Phone numbers, etc.</para>
-        </listitem>
-      </itemizedlist>
-      <para>The Order record type would include things like: </para>
-      <itemizedlist>
-        <listitem>
-          <para>Customer number</para>
-        </listitem>
-        <listitem>
-          <para>Order number</para>
-        </listitem>
-        <listitem>
-          <para>Sales date</para>
-        </listitem>
-        <listitem>
-          <para>A series of nested objects for shipping locations and line-items (see <xref
-              linkend="schema.casestudies.custorder.obj" /> for details)</para>
-        </listitem>
-      </itemizedlist>
-      <para>Assuming that the combination of customer number and sales order uniquely identify an
-        order, these two attributes will compose the rowkey, and specifically a composite key such
-        as: </para>
-      <screen>[customer number][order number]</screen>
-      <para>… for a ORDER table. However, there are more design decisions to make: are the
-          <emphasis>raw</emphasis> values the best choices for rowkeys? </para>
-      <para>The same design questions in the Log Data use-case confront us here. What is the
-        keyspace of the customer number, and what is the format (e.g., numeric? alphanumeric?) As it
-        is advantageous to use fixed-length keys in HBase, as well as keys that can support a
-        reasonable spread in the keyspace, similar options appear: </para>
-      <para>Composite Rowkey With Hashes: </para>
-      <itemizedlist>
-        <listitem>
-          <para>[MD5 of customer number] = 16 bytes</para>
-        </listitem>
-        <listitem>
-          <para>[MD5 of order number] = 16 bytes</para>
-        </listitem>
-      </itemizedlist>
-      <para>Composite Numeric/Hash Combo Rowkey: </para>
-      <itemizedlist>
-        <listitem>
-          <para>[substituted long for customer number] = 8 bytes</para>
-        </listitem>
-        <listitem>
-          <para>[MD5 of order number] = 16 bytes</para>
-        </listitem>
-      </itemizedlist>
-      <section
-        xml:id="schema.casestudies.custorder.tables">
-        <title>Single Table? Multiple Tables?</title>
-        <para>A traditional design approach would have separate tables for CUSTOMER and SALES.
-          Another option is to pack multiple record types into a single table (e.g., CUSTOMER++). </para>
-        <para>Customer Record Type Rowkey: </para>
-        <itemizedlist>
-          <listitem>
-            <para>[customer-id]</para>
-          </listitem>
-          <listitem>
-            <para>[type] = type indicating ‘1’ for customer record type</para>
-          </listitem>
-        </itemizedlist>
-        <para>Order Record Type Rowkey: </para>
-        <itemizedlist>
-          <listitem>
-            <para>[customer-id]</para>
-          </listitem>
-          <listitem>
-            <para>[type] = type indicating ‘2’ for order record type</para>
-          </listitem>
-          <listitem>
-            <para>[order]</para>
-          </listitem>
-        </itemizedlist>
-        <para>The advantage of this particular CUSTOMER++ approach is that organizes many different
-          record-types by customer-id (e.g., a single scan could get you everything about that
-          customer). The disadvantage is that it’s not as easy to scan for a particular record-type.
-        </para>
-      </section>
-      <section
-        xml:id="schema.casestudies.custorder.obj">
-        <title>Order Object Design</title>
-        <para>Now we need to address how to model the Order object. Assume that the class structure
-          is as follows:</para>
-        <variablelist>
-          <varlistentry>
-            <term>Order</term>
-            <listitem>
-              <para>(an Order can have multiple ShippingLocations</para>
-            </listitem>
-          </varlistentry>
-          <varlistentry>
-            <term>LineItem</term>
-            <listitem>
-              <para>(a ShippingLocation can have multiple LineItems</para>
-            </listitem>
-          </varlistentry>
-        </variablelist>
-        <para>... there are multiple options on storing this data. </para>
-        <section
-          xml:id="schema.casestudies.custorder.obj.norm">
-          <title>Completely Normalized</title>
-          <para>With this approach, there would be separate tables for ORDER, SHIPPING_LOCATION, and
-            LINE_ITEM. </para>
-          <para>The ORDER table's rowkey was described above: <xref
-              linkend="schema.casestudies.custorder" />
-          </para>
-          <para>The SHIPPING_LOCATION's composite rowkey would be something like this: </para>
-          <itemizedlist>
-            <listitem>
-              <para>[order-rowkey]</para>
-            </listitem>
-            <listitem>
-              <para>[shipping location number] (e.g., 1st location, 2nd, etc.)</para>
-            </listitem>
-          </itemizedlist>
-          <para>The LINE_ITEM table's composite rowkey would be something like this: </para>
-          <itemizedlist>
-            <listitem>
-              <para>[order-rowkey]</para>
-            </listitem>
-            <listitem>
-              <para>[shipping location number] (e.g., 1st location, 2nd, etc.)</para>
-            </listitem>
-            <listitem>
-              <para>[line item number] (e.g., 1st lineitem, 2nd, etc.)</para>
-            </listitem>
-          </itemizedlist>
-          <para>Such a normalized model is likely to be the approach with an RDBMS, but that's not
-            your only option with HBase. The cons of such an approach is that to retrieve
-            information about any Order, you will need: </para>
-          <itemizedlist>
-            <listitem>
-              <para>Get on the ORDER table for the Order</para>
-            </listitem>
-            <listitem>
-              <para>Scan on the SHIPPING_LOCATION table for that order to get the ShippingLocation
-                instances</para>
-            </listitem>
-            <listitem>
-              <para>Scan on the LINE_ITEM for each ShippingLocation</para>
-            </listitem>
-          </itemizedlist>
-          <para>... granted, this is what an RDBMS would do under the covers anyway, but since there
-            are no joins in HBase you're just more aware of this fact. </para>
-        </section>
-        <section
-          xml:id="schema.casestudies.custorder.obj.rectype">
-          <title>Single Table With Record Types</title>
-          <para>With this approach, there would exist a single table ORDER that would contain </para>
-          <para>The Order rowkey was described above: <xref
-              linkend="schema.casestudies.custorder" /></para>
-          <itemizedlist>
-            <listitem>
-              <para>[order-rowkey]</para>
-            </listitem>
-            <listitem>
-              <para>[ORDER record type]</para>
-            </listitem>
-          </itemizedlist>
-          <para>The ShippingLocation composite rowkey would be something like this: </para>
-          <itemizedlist>
-            <listitem>
-              <para>[order-rowkey]</para>
-            </listitem>
-            <listitem>
-              <para>[SHIPPING record type]</para>
-            </listitem>
-            <listitem>
-              <para>[shipping location number] (e.g., 1st location, 2nd, etc.)</para>
-            </listitem>
-          </itemizedlist>
-          <para>The LineItem composite rowkey would be something like this: </para>
-          <itemizedlist>
-            <listitem>
-              <para>[order-rowkey]</para>
-            </listitem>
-            <listitem>
-              <para>[LINE record type]</para>
-            </listitem>
-            <listitem>
-              <para>[shipping location number] (e.g., 1st location, 2nd, etc.)</para>
-            </listitem>
-            <listitem>
-              <para>[line item number] (e.g., 1st lineitem, 2nd, etc.)</para>
-            </listitem>
-          </itemizedlist>
-        </section>
-        <section
-          xml:id="schema.casestudies.custorder.obj.denorm">
-          <title>Denormalized</title>
-          <para>A variant of the Single Table With Record Types approach is to denormalize and
-            flatten some of the object hierarchy, such as collapsing the ShippingLocation attributes
-            onto each LineItem instance. </para>
-          <para>The LineItem composite rowkey would be something like this: </para>
-          <itemizedlist>
-            <listitem>
-              <para>[order-rowkey]</para>
-            </listitem>
-            <listitem>
-              <para>[LINE record type]</para>
-            </listitem>
-            <listitem>
-              <para>[line item number] (e.g., 1st lineitem, 2nd, etc. - care must be taken that
-                there are unique across the entire order)</para>
-            </listitem>
-          </itemizedlist>
-          <para>... and the LineItem columns would be something like this: </para>
-          <itemizedlist>
-            <listitem>
-              <para>itemNumber</para>
-            </listitem>
-            <listitem>
-              <para>quantity</para>
-            </listitem>
-            <listitem>
-              <para>price</para>
-            </listitem>
-            <listitem>
-              <para>shipToLine1 (denormalized from ShippingLocation)</para>
-            </listitem>
-            <listitem>
-              <para>shipToLine2 (denormalized from ShippingLocation)</para>
-            </listitem>
-            <listitem>
-              <para>shipToCity (denormalized from ShippingLocation)</para>
-            </listitem>
-            <listitem>
-              <para>shipToState (denormalized from ShippingLocation)</para>
-            </listitem>
-            <listitem>
-              <para>shipToZip (denormalized from ShippingLocation)</para>
-            </listitem>
-          </itemizedlist>
-          <para>The pros of this approach include a less complex object heirarchy, but one of the
-            cons is that updating gets more complicated in case any of this information changes.
-          </para>
-        </section>
-        <section
-          xml:id="schema.casestudies.custorder.obj.singleobj">
-          <title>Object BLOB</title>
-          <para>With this approach, the entire Order object graph is treated, in one way or another,
-            as a BLOB. For example, the ORDER table's rowkey was described above: <xref
-              linkend="schema.casestudies.custorder" />, and a single column called "order" would
-            contain an object that could be deserialized that contained a container Order,
-            ShippingLocations, and LineItems. </para>
-          <para>There are many options here: JSON, XML, Java Serialization, Avro, Hadoop Writables,
-            etc. All of them are variants of the same approach: encode the object graph to a
-            byte-array. Care should be taken with this approach to ensure backward compatibilty in
-            case the object model changes such that older persisted structures can still be read
-            back out of HBase. </para>
-          <para>Pros are being able to manage complex object graphs with minimal I/O (e.g., a single
-            HBase Get per Order in this example), but the cons include the aforementioned warning
-            about backward compatiblity of serialization, language dependencies of serialization
-            (e.g., Java Serialization only works with Java clients), the fact that you have to
-            deserialize the entire object to get any piece of information inside the BLOB, and the
-            difficulty in getting frameworks like Hive to work with custom objects like this.
-          </para>
-        </section>
-      </section>
-      <!--  cust/order order object -->
-    </section>
-    <!--  cust/order -->
-
-    <section
-      xml:id="schema.smackdown">
-      <title>Case Study - "Tall/Wide/Middle" Schema Design Smackdown</title>
-      <para>This section will describe additional schema design questions that appear on the
-        dist-list, specifically about tall and wide tables. These are general guidelines and not
-        laws - each application must consider its own needs. </para>
-      <section
-        xml:id="schema.smackdown.rowsversions">
-        <title>Rows vs. Versions</title>
-        <para>A common question is whether one should prefer rows or HBase's built-in-versioning.
-          The context is typically where there are "a lot" of versions of a row to be retained
-          (e.g., where it is significantly above the HBase default of 1 max versions). The
-          rows-approach would require storing a timestamp in some portion of the rowkey so that they
-          would not overwite with each successive update. </para>
-        <para>Preference: Rows (generally speaking). </para>
-      </section>
-      <section
-        xml:id="schema.smackdown.rowscols">
-        <title>Rows vs. Columns</title>
-        <para>Another common question is whether one should prefer rows or columns. The context is
-          typically in extreme cases of wide tables, such as having 1 row with 1 million attributes,
-          or 1 million rows with 1 columns apiece. </para>
-        <para>Preference: Rows (generally speaking). To be clear, this guideline is in the context
-          is in extremely wide cases, not in the standard use-case where one needs to store a few
-          dozen or hundred columns. But there is also a middle path between these two options, and
-          that is "Rows as Columns." </para>
-      </section>
-      <section
-        xml:id="schema.smackdown.rowsascols">
-        <title>Rows as Columns</title>
-        <para>The middle path between Rows vs. Columns is packing data that would be a separate row
-          into columns, for certain rows. OpenTSDB is the best example of this case where a single
-          row represents a defined time-range, and then discrete events are treated as columns. This
-          approach is often more complex, and may require the additional complexity of re-writing
-          your data, but has the advantage of being I/O efficient. For an overview of this approach,
-          see <xref
-            linkend="schema.casestudies.log-steroids" />. </para>
-      </section>
-    </section>
-    <!--  note:  the following id is not consistent with the others becaus it was formerly in the Case Studies chapter,
-	    but I didn't want to break backward compatibility of the link.  But future entries should look like the above case-study
-	    links (schema.casestudies. ...)  -->
-    <section
-      xml:id="casestudies.schema.listdata">
-      <title>Case Study - List Data</title>
-      <para>The following is an exchange from the user dist-list regarding a fairly common question:
-        how to handle per-user list data in Apache HBase. </para>
-      <para>*** QUESTION ***</para>
-      <para> We're looking at how to store a large amount of (per-user) list data in HBase, and we
-        were trying to figure out what kind of access pattern made the most sense. One option is
-        store the majority of the data in a key, so we could have something like: </para>
-
-      <programlisting><![CDATA[
-<FixedWidthUserName><FixedWidthValueId1>:"" (no value)
-<FixedWidthUserName><FixedWidthValueId2>:"" (no value)
-<FixedWidthUserName><FixedWidthValueId3>:"" (no value)
-]]></programlisting>
-
-      <para>The other option we had was to do this entirely using:</para>
-      <programlisting language="xml"><![CDATA[
-<FixedWidthUserName><FixedWidthPageNum0>:<FixedWidthLength><FixedIdNextPageNum><ValueId1><ValueId2><ValueId3>...
-<FixedWidthUserName><FixedWidthPageNum1>:<FixedWidthLength><FixedIdNextPageNum><ValueId1><ValueId2><ValueId3>...
-    		]]></programlisting>
-      <para> where each row would contain multiple values. So in one case reading the first thirty
-        values would be: </para>
-      <programlisting language="java"><![CDATA[
-scan { STARTROW => 'FixedWidthUsername' LIMIT => 30}
-    		]]></programlisting>
-      <para>And in the second case it would be </para>
-      <programlisting>
-get 'FixedWidthUserName\x00\x00\x00\x00'
-    		</programlisting>
-      <para> The general usage pattern would be to read only the first 30 values of these lists,
-        with infrequent access reading deeper into the lists. Some users would have &lt;= 30 total
-        values in these lists, and some users would have millions (i.e. power-law distribution) </para>
-      <para> The single-value format seems like it would take up more space on HBase, but would
-        offer some improved retrieval / pagination flexibility. Would there be any significant
-        performance advantages to be able to paginate via gets vs paginating with scans? </para>
-      <para> My initial understanding was that doing a scan should be faster if our paging size is
-        unknown (and caching is set appropriately), but that gets should be faster if we'll always
-        need the same page size. I've ended up hearing different people tell me opposite things
-        about performance. I assume the page sizes would be relatively consistent, so for most use
-        cases we could guarantee that we only wanted one page of data in the fixed-page-length case.
-        I would also assume that we would have infrequent updates, but may have inserts into the
-        middle of these lists (meaning we'd need to update all subsequent rows). </para>
-      <para> Thanks for help / suggestions / follow-up questions. </para>
-      <para>*** ANSWER ***</para>
-      <para> If I understand you correctly, you're ultimately trying to store triples in the form
-        "user, valueid, value", right? E.g., something like: </para>
-      <programlisting>
-"user123, firstname, Paul",
-"user234, lastname, Smith"
-			</programlisting>
-      <para> (But the usernames are fixed width, and the valueids are fixed width). </para>
-      <para> And, your access pattern is along the lines of: "for user X, list the next 30 values,
-        starting with valueid Y". Is that right? And these values should be returned sorted by
-        valueid? </para>
-      <para> The tl;dr version is that you should probably go with one row per user+value, and not
-        build a complicated intra-row pagination scheme on your own unless you're really sure it is
-        needed. </para>
-      <para> Your two options mirror a common question people have when designing HBase schemas:
-        should I go "tall" or "wide"? Your first schema is "tall": each row represents one value for
-        one user, and so there are many rows in the table for each user; the row key is user +
-        valueid, and there would be (presumably) a single column qualifier that means "the value".
-        This is great if you want to scan over rows in sorted order by row key (thus my question
-        above, about whether these ids are sorted correctly). You can start a scan at any
-        user+valueid, read the next 30, and be done. What you're giving up is the ability to have
-        transactional guarantees around all the rows for one user, but it doesn't sound like you
-        need that. Doing it this way is generally recommended (see here <link
-          xlink:href="http://hbase.apache.org/book.html#schema.smackdown">http://hbase.apache.org/book.html#schema.smackdown</link>). </para>
-      <para> Your second option is "wide": you store a bunch of values in one row, using different
-        qualifiers (where the qualifier is the valueid). The simple way to do that would be to just
-        store ALL values for one user in a single row. I'm guessing you jumped to the "paginated"
-        version because you're assuming that storing millions of columns in a single row would be
-        bad for performance, which may or may not be true; as long as you're not trying to do too
-        much in a single request, or do things like scanning over and returning all of the cells in
-        the row, it shouldn't be fundamentally worse. The client has methods that allow you to get
-        specific slices of columns. </para>
-      <para> Note that neither case fundamentally uses more disk space than the other; you're just
-        "shifting" part of the identifying information for a value either to the left (into the row
-        key, in option one) or to the right (into the column qualifiers in option 2). Under the
-        covers, every key/value still stores the whole row key, and column family name. (If this is
-        a bit confusing, take an hour and watch Lars George's excellent video about understanding
-        HBase schema design: <link
-          xlink:href="http://www.youtube.com/watch?v=_HLoH_PgrLk)">http://www.youtube.com/watch?v=_HLoH_PgrLk)</link>. </para>
-      <para> A manually paginated version has lots more complexities, as you note, like having to
-        keep track of how many things are in each page, re-shuffling if new values are inserted,
-        etc. That seems significantly more complex. It might have some slight speed advantages (or
-        disadvantages!) at extremely high throughput, and the only way to really know that would be
-        to try it out. If you don't have time to build it both ways and compare, my advice would be
-        to start with the simplest option (one row per user+value). Start simple and iterate! :)
-      </para>
-    </section>
-    <!--  listdata -->
-
-  </section>
-  <!--  schema design cases -->
-  <section
-    xml:id="schema.ops">
-    <title>Operational and Performance Configuration Options</title>
-    <para>See the Performance section <xref
-        linkend="perf.schema" /> for more information operational and performance schema design
-      options, such as Bloom Filters, Table-configured regionsizes, compression, and blocksizes.
-    </para>
-  </section>
-
-</chapter>
-<!--  schema design -->


Mime
View raw message