gora-commits mailing list archives

From lewi...@apache.org
Subject svn commit: r1567702 - /gora/site/trunk/content/current/gora-hbase.md
Date Wed, 12 Feb 2014 17:59:31 GMT
Author: lewismc
Date: Wed Feb 12 17:59:30 2014
New Revision: 1567702

URL: http://svn.apache.org/r1567702
Log:
CMS commit to gora by lewismc

Modified:
    gora/site/trunk/content/current/gora-hbase.md

Modified: gora/site/trunk/content/current/gora-hbase.md
URL: http://svn.apache.org/viewvc/gora/site/trunk/content/current/gora-hbase.md?rev=1567702&r1=1567701&r2=1567702&view=diff
==============================================================================
--- gora/site/trunk/content/current/gora-hbase.md (original)
+++ gora/site/trunk/content/current/gora-hbase.md Wed Feb 12 17:59:30 2014
@@ -29,24 +29,24 @@ Say we wished to map some Employee data 
 
 Here you can see that we require the definition of two child elements within the 
 <code>gora-orm</code> mapping configuration, namely;
+
 1. The table element; where we specify: 
-    the HBase table name e.g. <b>Employee</b>,
-    the type and definition of families we wish to create within HBase.
-In this case we create one family which could have a combination of any of the following characteristics;
-    <b>name</b>: family name e.g. info
-    <b>compression</b>: the compression option to use in HBase. Please see <a href="http://hbase.apache.org/book/compression.html">HBase documentation</a>.
-    <b>blockCache</b>:  an LRU cache that contains three levels of block priority to allow for scan-resistance and in-memory ColumnFamilies. Please see <a href="https://hbase.apache.org/book/regionserver.arch.html#block.cache">HBase documentation</a>.
-    <b>blockSize</b>: The blocksize can be configured for each ColumnFamily in a table, and this defaults to 64k. Larger cell values require larger blocksizes. There is an inverse relationship between blocksize and the resulting StoreFile indexes (i.e., if the blocksize is doubled then the resulting indexes should be roughly halved). Please see <a href="http://hbase.apache.org/book/perf.schema.html#schema.cf.blocksize">HBase documentation</a>.
-    <b>bloomFilter</b>: Bloom Filters can be enabled per-ColumnFamily. We use <code>HColumnDescriptor.setBloomFilterType(NONE | ROW | ROWCOL)</code> to enable blooms per Column Family. Default = NONE for no bloom filters. If ROW, the hash of the row will be added to the bloom on each insert. If ROWCOL, the hash of the row + column family name + column family qualifier will be added to the bloom on each key insert. Please see <a href="http://hbase.apache.org/book/perf.schema.html#schema.bloom">HBase documentation</a>.
-    <b>maxVersions</b>: The maximum number of row versions to store is configured per column family via <code>HColumnDescriptor</code>. The default for max versions is <b>3</b>. This is an important parameter because HBase does not overwrite row values, but rather stores different values per row by time (and qualifier). Excess versions are removed during major compaction's. The number of max versions may need to be increased or decreased depending on application needs. Please see <a href="http://hbase.apache.org/book/schema.versions.html">HBase documentation</a>.
-    <b>timeToLive</b>: ColumnFamilies can set a TTL length in seconds, and HBase will automatically delete rows once the expiration time is reached. This applies to all versions of a row - even the current one. The TTL time encoded in the HBase for the row is specified in UTC. Please see <a href="https://hbase.apache.org/book/ttl.html">HBase documentation</a>.
-    <b>inMemory</b>: ColumnFamilies can optionally be defined as in-memory. Data is still persisted to disk, just like any other ColumnFamily. In-memory blocks have the highest priority in the Block Cache, but it is not a guarantee that the entire table will be in memory. Please see <a href="http://hbase.apache.org/book/perf.schema.html#cf.in.memory">HBase documentation</a>.
-2. Specification of persistent fields which values should map to;
-    the Persistent class name e.g. <b>org.apache.gora.examples.generated.Employee</b>,
-    the keyClass e.g. <b>java.lang.String</b> which specifies the keys which map to the field
+  * the HBase table name e.g. <b>Employee</b>,
+  * the type and definition of families we wish to create within HBase. In this case we create one family which could have a combination of any of the following characteristics;
+  1. <b>name</b>: family name e.g. info
+  2. <b>compression</b>: the compression option to use in HBase. Please see <a href="http://hbase.apache.org/book/compression.html">HBase documentation</a>.
+  3. <b>blockCache</b>:  an LRU cache that contains three levels of block priority to allow for scan-resistance and in-memory ColumnFamilies. Please see <a href="https://hbase.apache.org/book/regionserver.arch.html#block.cache">HBase documentation</a>.
+  4. <b>blockSize</b>: The blocksize can be configured for each ColumnFamily in a table, and this defaults to 64k. Larger cell values require larger blocksizes. There is an inverse relationship between blocksize and the resulting StoreFile indexes (i.e., if the blocksize is doubled then the resulting indexes should be roughly halved). Please see <a href="http://hbase.apache.org/book/perf.schema.html#schema.cf.blocksize">HBase documentation</a>.
+  5. <b>bloomFilter</b>: Bloom Filters can be enabled per-ColumnFamily. We use <code>HColumnDescriptor.setBloomFilterType(NONE | ROW | ROWCOL)</code> to enable blooms per Column Family. Default = NONE for no bloom filters. If ROW, the hash of the row will be added to the bloom on each insert. If ROWCOL, the hash of the row + column family name + column family qualifier will be added to the bloom on each key insert. Please see <a href="http://hbase.apache.org/book/perf.schema.html#schema.bloom">HBase documentation</a>.
+  6. <b>maxVersions</b>: The maximum number of row versions to store is configured per column family via <code>HColumnDescriptor</code>. The default for max versions is <b>3</b>. This is an important parameter because HBase does not overwrite row values, but rather stores different values per row by time (and qualifier). Excess versions are removed during major compaction's. The number of max versions may need to be increased or decreased depending on application needs. Please see <a href="http://hbase.apache.org/book/schema.versions.html">HBase documentation</a>.
+  7. <b>timeToLive</b>: ColumnFamilies can set a TTL length in seconds, and HBase will automatically delete rows once the expiration time is reached. This applies to all versions of a row - even the current one. The TTL time encoded in the HBase for the row is specified in UTC. Please see <a href="https://hbase.apache.org/book/ttl.html">HBase documentation</a>.
+  8. <b>inMemory</b>: ColumnFamilies can optionally be defined as in-memory. Data is still persisted to disk, just like any other ColumnFamily. In-memory blocks have the highest priority in the Block Cache, but it is not a guarantee that the entire table will be in memory. Please see <a href="http://hbase.apache.org/book/perf.schema.html#cf.in.memory">HBase documentation</a>.
+2. Specification of persistent fields which values should map to;
+  * the Persistent class name e.g. <b>org.apache.gora.examples.generated.Employee</b>,
+  * the keyClass e.g. <b>java.lang.String</b> which specifies the keys which map to the field
 values, 
-    the Table e.g. <b>Employee</b> which matches to the above Table definition,
-    finally fields which are to be persisted into HBase need to be configured such that they
+  * the Table e.g. <b>Employee</b> which matches to the above Table definition,
+  * finally fields which are to be persisted into HBase need to be configured such that they
 receive a <b>name</b> e.g. (name, dateOfBirth, ssn and salary respectively), the column <b>family</b>
 to which they belong e.g. (all info in this case) and an additional <b>qualifier</b>, which enables
 more granular control over the data to be persisted into HBase.
 more granular control over the data to be persisted into HBase.
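
For readers without the page's surrounding context, the mapping described in the diff above can be sketched roughly as follows. This is an illustrative fragment, not the file from the commit: the element and attribute names (<code>gora-orm</code>, table, family, keyClass, name, family, qualifier, maxVersions) are taken from the text above, while the nesting, the qualifier values and the choice to set <code>maxVersions</code> are assumptions made for the example.

```xml
<!-- Sketch only: structure inferred from the description above. -->
<gora-orm>

  <!-- 1. Table element: HBase table name plus its column families. -->
  <table name="Employee">
    <!-- One family named "info"; maxVersions="3" simply restates the
         default mentioned in the text and could be omitted. -->
    <family name="info" maxVersions="3"/>
  </table>

  <!-- 2. Persistent class, its key class, and the table it maps to. -->
  <class name="org.apache.gora.examples.generated.Employee"
         keyClass="java.lang.String"
         table="Employee">
    <!-- Qualifier values here are hypothetical placeholders. -->
    <field name="name"        family="info" qualifier="nm"/>
    <field name="dateOfBirth" family="info" qualifier="db"/>
    <field name="ssn"         family="info" qualifier="sn"/>
    <field name="salary"      family="info" qualifier="sl"/>
  </class>

</gora-orm>
```

Each field's <code>family</code> must name a family declared in the table element, and the class's <code>table</code> attribute must match the table name, which is the "matches to the above Table definition" requirement in the text.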


