Subject: svn commit: r1201992 - in /hbase/trunk/src/docbkx: book.xml performance.xml
Date: Tue, 15 Nov 2011 01:14:10 -0000
To: commits@hbase.apache.org
From: dmeil@apache.org
Reply-To: dev@hbase.apache.org
Message-Id: <20111115011410.E20B823889CB@eris.apache.org>

Author: dmeil
Date: Tue Nov 15 01:14:10 2011
New Revision: 1201992

URL: http://svn.apache.org/viewvc?rev=1201992&view=rev
Log:
HBASE-4786 book.xml,performance.xml adding and reorg of schema info

Modified:
    hbase/trunk/src/docbkx/book.xml
    hbase/trunk/src/docbkx/performance.xml

Modified: hbase/trunk/src/docbkx/book.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/book.xml?rev=1201992&r1=1201991&r2=1201992&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/book.xml (original)
+++ hbase/trunk/src/docbkx/book.xml Tue Nov 15 01:14:10 2011
@@ -545,7 +545,8 @@ admin.modifyColumn(table, cf2 ); // m
admin.enableTable(table);
See for more information about configuring client connections.
-
+ Note: online schema changes are supported in the 0.92.x codebase, but the 0.90.x codebase requires the table
+ to be disabled.
@@ -739,17 +740,6 @@ System.out.println("md5 digest as string
-
-
- In-Memory ColumnFamilies
-
- ColumnFamilies can optionally be defined as in-memory. Data is still persisted to disk, just like any other ColumnFamily.
- In-memory blocks have the highest priority in the , but it is not a guarantee that the entire table
- will be in memory.
-
- See HColumnDescriptor for more information.
-
-
Time To Live (TTL)
ColumnFamilies can set a TTL length in seconds, and HBase will automatically delete rows once the expiration time is reached.
@@ -775,20 +765,6 @@ System.out.println("md5 digest as string
See HColumnDescriptor for more information.
-
- Bloom Filters
- Bloom Filters can be enabled per-ColumnFamily.
- Use HColumnDescriptor.setBloomFilterType(NONE | ROW |
- ROWCOL) to enable blooms per Column Family. Default =
- NONE for no bloom filters. If
- ROW, the hash of the row will be added to the bloom
- on each insert. If ROWCOL, the hash of the row +
- column family + column family qualifier will be added to the bloom on
- each key insert.
- See HColumnDescriptor and
- for more information.
-
-
Secondary Indexes and Alternate Query Paths
@@ -874,6 +850,11 @@ System.out.println("md5 digest as string
</para>
</section>
</section>
+ <section xml:id="schema.ops"><title>Operational and Performance Configuration Options
+ See the Performance section for more information operational and performance
+ schema design options, such as Bloom Filters, Table-configured regionsizes, and blocksizes.
+
+
Modified: hbase/trunk/src/docbkx/performance.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/performance.xml?rev=1201992&r1=1201991&r2=1201992&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/performance.xml (original)
+++ hbase/trunk/src/docbkx/performance.xml Tue Nov 15 01:14:10 2011
@@ -140,10 +140,13 @@ The number of regions for an HBase table is driven by the . Also, see the architecture section on
- A lower number of regions is preferred, generally in the range of 20 to 200
- per RegionServer. Adjust the regionsize as appropriate to achieve this number. There
- are some clusters that set the regionsize to 20Gb, for example, so you may need to
- experiment with this setting based on your hardware configuration and application needs.
+ A lower number of regions is preferred, generally in the range of 20 to low-hundreds
+ per RegionServer. Adjust the regionsize as appropriate to achieve this number.
+
+ For the 0.90.x codebase, the upper-bound of regionsize is about 4Gb.
+ For 0.92.x codebase, due to the HFile v2 change much larger regionsizes can be supported (e.g., 20Gb).
+
+ You may need to experiment with this setting based on your hardware configuration and application needs.
@@ -155,12 +158,6 @@ something you want to consider.
-
- Compression
- Production systems should use compression with their column family definitions. See for more information.
-
-
<varname>hbase.regionserver.handler.count</varname> See . @@ -218,7 +215,52 @@ Key and Attribute Lengths See .
-
+ Table RegionSize
+ The regionsize can be set on a per-table basis via setFileSize on
+ HTableDescriptor in the
+ event where certain tables require different regionsizes than the configured default regionsize.
+
+ See for more information.
+
+
+
+ Bloom Filters
+ Bloom Filters can be enabled per-ColumnFamily.
+ Use HColumnDescriptor.setBloomFilterType(NONE | ROW |
+ ROWCOL) to enable blooms per Column Family. Default =
+ NONE for no bloom filters. If
+ ROW, the hash of the row will be added to the bloom
+ on each insert. If ROWCOL, the hash of the row +
+ column family + column family qualifier will be added to the bloom on
+ each key insert.
+ See HColumnDescriptor and
+ for more information.
+
+ ColumnFamily BlockSize
+ The blocksize can be configured for each ColumnFamily in a table, and this defaults to 64k. Larger cell values require larger blocksizes.
+ There is an inverse relationship between blocksize and the resulting StoreFile indexes (i.e., if the blocksize is doubled then the resulting
+ indexes should be roughly halved).
+
+ See HColumnDescriptor
+ and for more information.
+
+
+
+ In-Memory ColumnFamilies
+ ColumnFamilies can optionally be defined as in-memory. Data is still persisted to disk, just like any other ColumnFamily.
+ In-memory blocks have the highest priority in the , but it is not a guarantee that the entire table
+ will be in memory.
+
+ See HColumnDescriptor for more information.
+
+
+
+ Compression
+ Production systems should use compression with their ColumnFamily definitions. See for more information.
+
+
+
Writing to HBase
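
Illustrative note, not part of r1201992: two sizing remarks in the performance.xml text above lend themselves to quick arithmetic, namely the inverse relationship between blocksize and StoreFile index size, and the effect of regionsize on the number of regions per RegionServer. The following is a minimal Java sketch under stated assumptions. The class SchemaSizing and its methods are hypothetical names and do not correspond to any HBase API. The model assumes one index entry per block and evenly sized, evenly distributed regions, and it ignores compression, key lengths, and the HFile v2 index layout, so treat the numbers as rough estimates only.

```java
/**
 * Back-of-the-envelope sizing helpers.
 * Hypothetical illustration only: these are not HBase APIs, and the model
 * (one index entry per block, evenly sized regions) is a simplification.
 */
final class SchemaSizing {
    private SchemaSizing() {}

    /** Ceiling division for a non-negative numerator and a positive denominator. */
    public static long ceilDiv(long numerator, long denominator) {
        if (numerator < 0 || denominator <= 0) {
            throw new IllegalArgumentException("numerator must be >= 0 and denominator > 0");
        }
        return (numerator + denominator - 1) / denominator;
    }

    /** Assumed model: one block-index entry per block in the StoreFile. */
    public static long approxIndexEntries(long storeFileBytes, long blockSizeBytes) {
        return ceilDiv(storeFileBytes, blockSizeBytes);
    }

    /** Assumed model: the table splits into equal regions spread evenly over the servers. */
    public static long approxRegionsPerServer(long tableBytes, long regionSizeBytes, int regionServers) {
        long regions = ceilDiv(tableBytes, regionSizeBytes);
        return ceilDiv(regions, regionServers);
    }

    public static void main(String[] args) {
        final long KB = 1024L;
        final long GB = 1024L * 1024L * 1024L;

        // Doubling the blocksize roughly halves the number of index entries.
        long storeFile = 1 * GB;
        System.out.println("index entries at 64k blocks:  " + approxIndexEntries(storeFile, 64 * KB));
        System.out.println("index entries at 128k blocks: " + approxIndexEntries(storeFile, 128 * KB));

        // A larger regionsize lowers the region count per RegionServer.
        long table = 10 * 1024 * GB; // hypothetical 10 TB table
        int servers = 10;            // hypothetical cluster size
        System.out.println("regions/server at 4 GB regions:  " + approxRegionsPerServer(table, 4 * GB, servers));
        System.out.println("regions/server at 20 GB regions: " + approxRegionsPerServer(table, 20 * GB, servers));
    }
}
```

With these hypothetical inputs, moving from 4 GB to 20 GB regions takes the estimate from the mid-hundreds down to the tens of regions per RegionServer, which is the direction the regionsize guidance above recommends.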