hbase-commits mailing list archives

From dm...@apache.org
Subject svn commit: r1415422 - in /hbase/trunk/src/docbkx: book.xml performance.xml
Date Thu, 29 Nov 2012 22:46:02 GMT
Author: dmeil
Date: Thu Nov 29 22:46:02 2012
New Revision: 1415422

URL: http://svn.apache.org/viewvc?rev=1415422&view=rev
Log:
hbase-7241.  refGuide.  Perf/Schema design cleanup.

Modified:
    hbase/trunk/src/docbkx/book.xml
    hbase/trunk/src/docbkx/performance.xml

Modified: hbase/trunk/src/docbkx/book.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/book.xml?rev=1415422&r1=1415421&r2=1415422&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/book.xml (original)
+++ hbase/trunk/src/docbkx/book.xml Thu Nov 29 22:46:02 2012
@@ -775,6 +775,34 @@ System.out.println("md5 digest as string
     <para>Lesson #2:  While generally not advisable, using hex-keys (and more generally,
displayable data) can still work with pre-split
     tables as long as all the created regions are accessible in the keyspace.
     </para>
+	 <para>To conclude this example, the following shows how appropriate splits
can be pre-created for hex-keys:
+	    </para>
+<programlisting>public static boolean createTable(HBaseAdmin admin, HTableDescriptor
table, byte[][] splits)
+throws IOException {
+  try {
+    admin.createTable( table, splits );
+    return true;
+  } catch (TableExistsException e) {
+    logger.info("table " + table.getNameAsString() + " already exists");
+    // the table already exists...
+    return false;
+  }
+}
+
+public static byte[][] getHexSplits(String startKey, String endKey, int numRegions) {
+  byte[][] splits = new byte[numRegions-1][];
+  BigInteger lowestKey = new BigInteger(startKey, 16);
+  BigInteger highestKey = new BigInteger(endKey, 16);
+  BigInteger range = highestKey.subtract(lowestKey);
+  BigInteger regionIncrement = range.divide(BigInteger.valueOf(numRegions));
+  lowestKey = lowestKey.add(regionIncrement);
+  for(int i=0; i &lt; numRegions-1;i++) {
+    BigInteger key = lowestKey.add(regionIncrement.multiply(BigInteger.valueOf(i)));
+    byte[] b = String.format("%016x", key).getBytes();
+    splits[i] = b;
+  }
+  return splits;
+}</programlisting>
     </section>
     </section>  <!--  rowkey design -->
     <section xml:id="schema.versions">

Modified: hbase/trunk/src/docbkx/performance.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/performance.xml?rev=1415422&r1=1415421&r2=1415422&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/performance.xml (original)
+++ hbase/trunk/src/docbkx/performance.xml Thu Nov 29 22:46:02 2012
@@ -303,35 +303,27 @@
     Table Creation: Pre-Creating Regions
     </title>
 <para>
-Tables in HBase are initially created with one region by default.  For bulk imports, this
means that all clients will write to the same region until it is large enough to split and
become distributed across the cluster.  A useful pattern to speed up the bulk import process
is to pre-create empty regions.  Be somewhat conservative in this, because too-many regions
can actually degrade performance.  An example of pre-creation using hex-keys is as follows
(note:  this example may need to be tweaked to the individual applications keys):
+Tables in HBase are initially created with one region by default.  For bulk imports, this
means that all clients will write to the same region 
+until it is large enough to split and become distributed across the cluster.  A useful pattern
to speed up the bulk import process is to pre-create empty regions. 
+ Be somewhat conservative in this, because too many regions can actually degrade performance.
 
 </para>
+	<para>There are two different approaches to pre-creating splits.  The first approach
is to rely on the default <code>HBaseAdmin</code> strategy 
+	(which is implemented in <code>Bytes.split</code>)...
+	</para>
+<programlisting>
+byte[] startKey = ...;   	// your lowest key
+byte[] endKey = ...;   		// your highest key
+int numberOfRegions = ...;	// # of regions to create
+admin.createTable(table, startKey, endKey, numberOfRegions);
+</programlisting>
+	<para>And the other approach is to define the splits yourself...
+	</para>
+<programlisting>
+byte[][] splits = ...;   // create your own splits
+admin.createTable(table, splits);
+</programlisting>
 <para>
-<programlisting>public static boolean createTable(HBaseAdmin admin, HTableDescriptor
table, byte[][] splits)
-throws IOException {
-  try {
-    admin.createTable( table, splits );
-    return true;
-  } catch (TableExistsException e) {
-    logger.info("table " + table.getNameAsString() + " already exists");
-    // the table already exists...
-    return false;
-  }
-}
-
-public static byte[][] getHexSplits(String startKey, String endKey, int numRegions) {
-  byte[][] splits = new byte[numRegions-1][];
-  BigInteger lowestKey = new BigInteger(startKey, 16);
-  BigInteger highestKey = new BigInteger(endKey, 16);
-  BigInteger range = highestKey.subtract(lowestKey);
-  BigInteger regionIncrement = range.divide(BigInteger.valueOf(numRegions));
-  lowestKey = lowestKey.add(regionIncrement);
-  for(int i=0; i &lt; numRegions-1;i++) {
-    BigInteger key = lowestKey.add(regionIncrement.multiply(BigInteger.valueOf(i)));
-    byte[] b = String.format("%016x", key).getBytes();
-    splits[i] = b;
-  }
-  return splits;
-}</programlisting>
+   See <xref linkend="rowkey.regionsplits"/> for issues related to understanding your
keyspace and pre-creating regions.
   </para>
   </section>
     <section xml:id="def.log.flush">
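
Note on the getHexSplits listing moved into book.xml above: the following is a minimal standalone sketch of what that method returns, with the HBase calls left out so it runs on a plain JDK. The class name HexSplitsDemo and the sample keyspace (16-digit hex keys from all zeros to all f's, 4 regions) are assumptions chosen for illustration and are not part of the commit. The explicit US_ASCII charset is also an addition here; the committed code calls getBytes() with the platform default.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

public class HexSplitsDemo {

  // Same arithmetic as the getHexSplits listing in the commit:
  // numRegions regions need numRegions-1 split points, evenly spaced
  // between startKey and endKey (both parsed as hexadecimal numbers).
  public static byte[][] getHexSplits(String startKey, String endKey, int numRegions) {
    byte[][] splits = new byte[numRegions - 1][];
    BigInteger lowestKey = new BigInteger(startKey, 16);
    BigInteger highestKey = new BigInteger(endKey, 16);
    BigInteger range = highestKey.subtract(lowestKey);
    BigInteger regionIncrement = range.divide(BigInteger.valueOf(numRegions));
    lowestKey = lowestKey.add(regionIncrement);
    for (int i = 0; i < numRegions - 1; i++) {
      BigInteger key = lowestKey.add(regionIncrement.multiply(BigInteger.valueOf(i)));
      // Zero-padded 16-digit lowercase hex, encoded explicitly as ASCII
      // (assumption: the committed code relies on the default charset).
      splits[i] = String.format("%016x", key).getBytes(StandardCharsets.US_ASCII);
    }
    return splits;
  }

  public static void main(String[] args) {
    // Hypothetical keyspace: the full 64-bit hex range, split into 4 regions.
    byte[][] splits = getHexSplits("0000000000000000", "ffffffffffffffff", 4);
    for (byte[] split : splits) {
      System.out.println(new String(split, StandardCharsets.US_ASCII));
    }
    // prints:
    // 3fffffffffffffff
    // 7ffffffffffffffe
    // bffffffffffffffd
  }
}
```

The resulting array is what would be passed as the splits argument of admin.createTable(table, splits) in the listing above. Because the split points are hex strings rather than raw bytes, they only help if the application's row keys are written in the same zero-padded lowercase hex form.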


