hbase-commits mailing list archives

From dm...@apache.org
Subject svn commit: r1210608 - in /hbase/trunk/src/docbkx: book.xml external_apis.xml troubleshooting.xml
Date Mon, 05 Dec 2011 20:24:19 GMT
Author: dmeil
Date: Mon Dec  5 20:24:18 2011
New Revision: 1210608

URL: http://svn.apache.org/viewvc?rev=1210608&view=rev
Log:
hbase-4958  book.xml, troubleshooting.xml, external_apis.xml  several cleanup items.

Modified:
    hbase/trunk/src/docbkx/book.xml
    hbase/trunk/src/docbkx/external_apis.xml
    hbase/trunk/src/docbkx/troubleshooting.xml

Modified: hbase/trunk/src/docbkx/book.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/book.xml?rev=1210608&r1=1210607&r2=1210608&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/book.xml (original)
+++ hbase/trunk/src/docbkx/book.xml Mon Dec  5 20:24:18 2011
@@ -1497,6 +1497,12 @@ scan.setFilter(filter);
           to filter based on the lead portion of Column (aka Qualifier) names.
           </para>
         </section>
+        <section xml:id="client.filter.kvm.crf"><title>ColumnRangeFilter</title>
+			<para>Use <link xlink:href="http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/ColumnRangeFilter.html">ColumnRangeFilter</link> to get a column 'slice', i.e., if you have a million columns in a row but you only want to look at columns bbbb-bbbd.
+            </para>
+            <para>Note: Introduced in HBase 0.92</para>
+        </section>
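The slice semantics above can be sketched with plain JDK collections. This is a toy model of what ColumnRangeFilter computes, not the HBase API (the real filter takes byte[] bounds plus minColumnInclusive/maxColumnInclusive flags); the class and qualifier names here are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class ColumnSlice {
    // Toy model of ColumnRangeFilter semantics: keep qualifiers q with
    // minColumn <= q <= maxColumn in lexicographic order (both ends
    // inclusive here, mirroring the filter's inclusive flags).
    static List<String> slice(TreeSet<String> qualifiers, String min, String max) {
        return new ArrayList<>(qualifiers.subSet(min, true, max, true));
    }

    public static void main(String[] args) {
        // A row with many qualifiers; we only want the bbbb-bbbd slice.
        TreeSet<String> quals = new TreeSet<>(
            Arrays.asList("aaaa", "bbbb", "bbbc", "bbbd", "cccc"));
        System.out.println(slice(quals, "bbbb", "bbbd")); // [bbbb, bbbc, bbbd]
    }
}
```

The point of the filter is that this selection happens server-side during the scan, so the client never transfers the other columns.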
       </section>
       <section xml:id="client.filter.row"><title>RowKey</title>
         <section xml:id="client.filter.row.rf"><title>RowFilter</title>
@@ -2022,13 +2028,11 @@ scan.setFilter(filter);
                 </para>
             </answer>
         </qandaentry>
-        <qandaentry>
-            <question><para>
-                    How can I get a column 'slice': i.e. I have a million columns in my row but I only want to look at columns bbbb-bbbd?
-            </para></question>
+        <qandaentry xml:id="faq.apis">
+            <question><para>What APIs does HBase support?</para></question>
             <answer>
                 <para>
-                  See <classname>org.apache.hadoop.hbase.filter.ColumnRangeFilter</classname>
+                    See <xref linkend="datamodel" />, <xref linkend="client" /> and <xref linkend="nonjava.jvm"/>.
                 </para>
             </answer>
         </qandaentry>
@@ -2064,18 +2068,6 @@ scan.setFilter(filter);
                 </para>
             </answer>
         </qandaentry>
-        <qandaentry xml:id="brand.new.compressor">
-            <question><para>Why are logs flooded with '2011-01-10 12:40:48,407 INFO org.apache.hadoop.io.compress.CodecPool: Got
-            brand-new compressor' messages?</para></question>
-            <answer>
-                <para>
-                    Because we are not using the native versions of compression
-                    libraries.  See <link xlink:href="https://issues.apache.org/jira/browse/HBASE-1900">HBASE-1900 Put back native support when hadoop 0.21 is released</link>.
-                    Copy the native libs from hadoop under hbase lib dir or
-                    symlink them into place and the message should go away.
-                </para>
-            </answer>
-        </qandaentry>
     </qandadiv>
     <qandadiv xml:id="ec2"><title>Amazon EC2</title>
         <qandaentry>
@@ -2248,6 +2240,7 @@ hbase> describe 't1'</programlisting>
    <title>HFile format version 2</title>
 
    <section><title>Motivation </title>
+   <para>Note: this feature was introduced in HBase 0.92</para>
    <para>We found it necessary to revise the HFile format after encountering high memory
usage and slow startup times caused by large Bloom filters and block indexes in the region
server. Bloom filters can get as large as 100 MB per HFile, which adds up to 2 GB when aggregated
over 20 regions. Block indexes can grow as large as 6 GB in aggregate size over the same set
of regions. A region is not considered opened until all of its block index data is loaded.
Large Bloom filters produce a different performance problem: the first get request that requires
a Bloom filter lookup will incur the latency of loading the entire Bloom filter bit array.</para>
    <para>To speed up region server startup we break Bloom filters and block indexes
into multiple blocks and write those blocks out as they fill up, which also reduces the HFile
writer’s memory footprint. In the Bloom filter case, “filling up a block” means
accumulating enough keys to efficiently utilize a fixed-size bit array, and in the block index
case we accumulate an “index block” of the desired size. Bloom filter blocks and
index blocks (we call these “inline blocks”) become interspersed with data blocks,
and as a side effect we can no longer rely on the difference between block offsets to determine
data block length, as was done in version 1.</para>
    <para>HFile is a low-level file format by design, and it should not deal with application-specific
details such as Bloom filters, which are handled at StoreFile level. Therefore, we call Bloom
filter blocks in an HFile "inline" blocks. We also supply HFile with an interface to write
those inline blocks. </para>
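The fill-and-flush behavior described above can be sketched as follows. The class, field names, and tiny threshold are invented for illustration and are not HFile's actual writer API; the sketch only shows why the writer's memory footprint stays bounded.

```java
import java.util.ArrayList;
import java.util.List;

public class InlineBlockSketch {
    // Illustrative threshold only; the real writer flushes when a
    // fixed-size structure (Bloom bit array or index block) fills up.
    static final int ENTRIES_PER_BLOCK = 3;

    final List<String> current = new ArrayList<>();
    final List<List<String>> flushed = new ArrayList<>();

    // Accumulate entries and write a block out as soon as it fills,
    // rather than holding one monolithic Bloom filter/index in memory.
    void add(String entry) {
        current.add(entry);
        if (current.size() == ENTRIES_PER_BLOCK) {
            flushed.add(new ArrayList<>(current)); // inline block, interleaved with data
            current.clear();
        }
    }

    public static void main(String[] args) {
        InlineBlockSketch w = new InlineBlockSketch();
        for (String k : new String[] {"k1", "k2", "k3", "k4"}) {
            w.add(k);
        }
        // One full block has been flushed; "k4" is still accumulating.
        System.out.println(w.flushed.size() + " flushed, " + w.current.size() + " pending");
    }
}
```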

Modified: hbase/trunk/src/docbkx/external_apis.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/external_apis.xml?rev=1210608&r1=1210607&r2=1210608&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/external_apis.xml (original)
+++ hbase/trunk/src/docbkx/external_apis.xml Mon Dec  5 20:24:18 2011
@@ -50,6 +50,7 @@
     </para>
           <section xml:id="thrift.filter-language"><title>Filter Language</title>
              <section xml:id="use-case"><title>Use Case</title>
+               <para>Note: this feature was introduced in HBase 0.92</para>
+               <para>This allows the user to perform server-side filtering when accessing HBase over Thrift. The user specifies a filter via a string, which is parsed on the server to construct the filter.</para>
              </section>
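To make "parsed on the server" concrete, here is a toy parser for a single one-argument filter expression such as PrefixFilter('bbbb'). The class and method are hypothetical and only gesture at the idea; HBase's real filter-language parser also handles multi-argument filters and combining operators.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FilterStringSketch {
    // Toy parse of a filter expression of the form Name('arg'):
    // extract the filter name and its single quoted argument.
    static String[] parse(String spec) {
        Matcher m = Pattern.compile("(\\w+)\\s*\\(\\s*'([^']*)'\\s*\\)").matcher(spec.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("unparseable filter string: " + spec);
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("PrefixFilter ('bbbb')")));
        // [PrefixFilter, bbbb]
    }
}
```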
 

Modified: hbase/trunk/src/docbkx/troubleshooting.xml
URL: http://svn.apache.org/viewvc/hbase/trunk/src/docbkx/troubleshooting.xml?rev=1210608&r1=1210607&r2=1210608&view=diff
==============================================================================
--- hbase/trunk/src/docbkx/troubleshooting.xml (original)
+++ hbase/trunk/src/docbkx/troubleshooting.xml Mon Dec  5 20:24:18 2011
@@ -819,6 +819,15 @@ ERROR org.apache.hadoop.hbase.regionserv
            RegionServer is not using the name given it by the master; double entry in master
listing of servers</link> for gory details.
           </para>
           </section>
+        <section xml:id="trouble.rs.runtime.codecmsgs">
+          <title>Logs flooded with '2011-01-10 12:40:48,407 INFO org.apache.hadoop.io.compress.CodecPool: Got
+            brand-new compressor' messages</title>
+                <para>This means the native versions of the compression
+                    libraries are not being used.  See <link xlink:href="https://issues.apache.org/jira/browse/HBASE-1900">HBASE-1900 Put back native support when hadoop 0.21 is released</link>.
+                    Copy the native libs from Hadoop under the HBase lib dir, or
+                    symlink them into place, and the message should go away.
+                </para>
+        </section>
 
       </section>    
       <section xml:id="trouble.rs.shutdown">


