hbase-commits mailing list archives

From st...@apache.org
Subject svn commit: r1091644 - in /hbase/trunk: CHANGES.txt src/docbkx/performance.xml
Date Wed, 13 Apr 2011 04:33:58 GMT
Author: stack
Date: Wed Apr 13 04:33:58 2011
New Revision: 1091644

URL: http://svn.apache.org/viewvc?rev=1091644&view=rev
Log:
HBASE-3768 Add best practice to book for loading row key only

Modified:
    hbase/trunk/CHANGES.txt
    hbase/trunk/src/docbkx/performance.xml

Modified: hbase/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hbase/trunk/CHANGES.txt?rev=1091644&r1=1091643&r2=1091644&view=diff
==============================================================================
--- hbase/trunk/CHANGES.txt (original)
+++ hbase/trunk/CHANGES.txt Wed Apr 13 04:33:58 2011
@@ -153,6 +153,8 @@ Release 0.91.0 - Unreleased
                as a convenience (Erik Onnen via Stack)
    HBASE-3769  TableMapReduceUtil is inconsistent with other table-related
                classes that accept byte[] as a table name (Erik Onnen via Stack)
+   HBASE-3768  Add best practice to book for loading row key only
+               (Erik Onnen via Stack)
 
   TASKS
    HBASE-3559  Move report of split to master OFF the heartbeat channel

Modified: hbase/trunk/src/docbkx/performance.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/performance.xml?rev=1091644&r1=1091643&r2=1091644&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/performance.xml (original)
+++ hbase/trunk/src/docbkx/performance.xml Wed Apr 13 04:33:58 2011
@@ -199,5 +199,16 @@ htable.close();</programlisting></para>
       <varname>false</varname>. For frequently accessed rows, it is advisable to use the block
       cache.</para>
     </section>
+    <section xml:id="perf.hbase.client.rowkeyonly">
+      <title>Optimal Loading of Row Keys</title>
+      <para>When performing a table <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html">scan</link>
+            where only the row keys are needed (no families, qualifiers, values or timestamps), add a FilterList with a
+            <varname>MUST_PASS_ALL</varname> operator to the scanner using <methodname>setFilter</methodname>. The filter list
+            should include both a <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.html">FirstKeyOnlyFilter</link>
+            and a <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/KeyOnlyFilter.html">KeyOnlyFilter</link>.
+            Using this filter combination will result in a worst case scenario of a region server reading a single value from disk
+            and minimal network traffic to the client for a single row.
+      </para>
+    </section>
   </section>
 </chapter>
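
The added paragraph describes combining a FirstKeyOnlyFilter and a KeyOnlyFilter in a FilterList with the MUST_PASS_ALL operator. The sketch below is a toy, in-memory model of what that combination is meant to achieve. It does not use the HBase client classes; RowKeyOnlySketch, Cell, firstKeyOnly, keyOnly and rowKeysOnly are hypothetical names introduced only for illustration, and the real filters run on the region server rather than on a list in memory.

```java
import java.util.ArrayList;
import java.util.List;

public class RowKeyOnlySketch {

    /** Hypothetical stand-in for one stored cell (row key, qualifier, value). */
    public static final class Cell {
        public final String row;
        public final String qualifier;
        public final byte[] value;

        public Cell(String row, String qualifier, byte[] value) {
            this.row = row;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    /** Models the idea behind FirstKeyOnlyFilter: keep only the first cell of each row. */
    public static List<Cell> firstKeyOnly(List<Cell> cellsSortedByRow) {
        List<Cell> out = new ArrayList<>();
        String lastRow = null;
        for (Cell c : cellsSortedByRow) {
            if (!c.row.equals(lastRow)) {
                out.add(c);
                lastRow = c.row;
            }
        }
        return out;
    }

    /** Models the idea behind KeyOnlyFilter: keep the key, drop the value bytes. */
    public static List<Cell> keyOnly(List<Cell> cells) {
        List<Cell> out = new ArrayList<>();
        for (Cell c : cells) {
            out.add(new Cell(c.row, c.qualifier, new byte[0]));
        }
        return out;
    }

    /** Models MUST_PASS_ALL: a cell is returned only after both steps are applied. */
    public static List<Cell> rowKeysOnly(List<Cell> cellsSortedByRow) {
        return keyOnly(firstKeyOnly(cellsSortedByRow));
    }

    public static void main(String[] args) {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell("row1", "a", new byte[100]));
        cells.add(new Cell("row1", "b", new byte[200]));
        cells.add(new Cell("row2", "a", new byte[300]));

        // Each row key appears once, and no value bytes are carried along.
        for (Cell c : rowKeysOnly(cells)) {
            System.out.println(c.row + " valueBytes=" + c.value.length);
        }
    }
}
```

In this model, three cells totalling 600 value bytes reduce to two entries with empty values, one per row key, which is the effect the documentation change attributes to the filter pair.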


