hadoop-common-dev mailing list archives

From Michael Stack <st...@duboce.net>
Subject Re: HBase Performance Q's
Date Mon, 18 Jun 2007 20:00:50 GMT
I recently spent some time profiling hbase writes.  I haven't yet spent 
time on reads.  The patch in HADOOP-1498 has fixes that came out of the 
write profiling sessions.  After applying the patch and running the 
http://wiki.apache.org/lucene-hadoop/Hbase/PerformanceEvaluation write 
tests, we seem to be spending more than 90% of allocations and CPU time 
in RPC, reading and writing instances of the deprecated UTF8 class.  (I 
believe HADOOP-414 is the issue that covers removing UTF8 from the RPC.  
If you are seeing the same phenomenon, you might add a vote so the issue 
gets pegged at a higher priority.)  Does the HADOOP-1498 patch help with 
the performance profile you are seeing in hbase?

More below...

James Kennedy wrote:
> ...
> Observations:
>
> Within the combined RPC$server.call() of all Handler threads, 
> HRegionServer.next() took about 90% in both the 10K and 100K cases.  
> This is as expected since it's called 10K/100K times. The biggest 
> bottleneck below that is HStoreKey.extractFamily() which is 33%/39% 
> of HRegionServer.next() which is more expensive than all the IO 
> operations in HStoreScanner.getNext() (31%/31%). Couldn't there be a 
> cheaper way to extract family from a col key?

The HADOOP-1498 patch redoes extractFamily so it uses Text#find and then 
a byte array copy to produce the column family.  It used to do a 
toString followed by an indexOf, substring, and then a new Text out of 
the produced substring.
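
To make the difference concrete, here is a minimal sketch of the byte-level approach.  It is not the code from the patch: it swaps in a plain byte scan where the patch uses Text#find, it assumes the family is everything before the first ':' in the column key, and the class and method names are hypothetical.  Scanning raw bytes is safe here because ':' (0x3A) never occurs inside a multi-byte UTF-8 sequence.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ExtractFamilySketch {
    /**
     * Hypothetical helper, not the HADOOP-1498 code: returns the bytes that
     * precede the first ':' in a column key, without creating any String.
     * Assumes the key is UTF-8 and that ':' separates family from the rest.
     */
    static byte[] extractFamily(byte[] column, int length) {
        for (int i = 0; i < length; i++) {
            if (column[i] == ':') {
                // One array copy; no toString/indexOf/substring round trip.
                return Arrays.copyOfRange(column, 0, i);
            }
        }
        throw new IllegalArgumentException("no family delimiter in column key");
    }

    public static void main(String[] args) {
        byte[] col = "info:name".getBytes(StandardCharsets.UTF_8);
        byte[] family = extractFamily(col, col.length);
        System.out.println(new String(family, StandardCharsets.UTF_8));
    }
}
```

The length argument is passed separately because a reusable byte buffer can be longer than the valid data it holds.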

> Outside the RPC$server.call(),
>
> Server$Connection.readAndProcess() takes about 17%/22% of the time 
> that HRegionServer.next() takes suggesting a heavy overhead in RPC IO 
> and unmarshalling.  Overhead in the other direction, 
> ObjectWritable.write() + DataOutputStream.flush() take 63%/57% of the 
> time that HRegionServer.next() takes.  The ratio of input/output makes 
> sense since more data is returned than sent. But this seems like a lot 
> of RPC overhead.
>

Yes.  See my comment above on what I'm seeing writing.

...
>
> Right now, if I assume that YourKit is counting block time and I 
> ignore those numbers, I come to the following conclusions:
>
> It looks like HStoreKey.extractFamily() is a bottleneck.  For 100k 
> rows and 10 cols, it will be called over 1,000,000 times during a full 
> table scan (even if just retrieving one col) and each call requires 
> multiple byte[] <-> string conversions.
>
If you have suggestions for how to improve on the version in 
HADOOP-1498, would love to hear them.

> RPC overhead can be quite high during a scan. Could batched/read-ahead 
> scanning be useful to minimize RPC calls?

Going by some primitive tests I ran locally, moving from UTF8 to Text 
would help somewhat with the time spent in RPC.

Batching, as per your suggestion in HADOOP-1468, would also help.
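
As a rough illustration of why batching cuts the number of round trips, here is a small self-contained sketch.  Everything in it is hypothetical: FakeServer stands in for a region server, each nextBatch() call models one RPC, and none of the names come from hbase or from HADOOP-1468.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class BatchedScanSketch {
    /** Hypothetical stand-in for a remote scanner; each nextBatch() call models one RPC. */
    static class FakeServer {
        private final int totalRows;
        private int cursor = 0;
        int rpcCalls = 0;

        FakeServer(int totalRows) { this.totalRows = totalRows; }

        List<String> nextBatch(int maxRows) {
            rpcCalls++;
            List<String> rows = new ArrayList<>();
            while (cursor < totalRows && rows.size() < maxRows) {
                rows.add("row-" + cursor++);
            }
            return rows;
        }
    }

    /** Client-side scanner that refills a local buffer one batch at a time. */
    static class BatchedScanner {
        private final FakeServer server;
        private final int batchSize;
        private final Deque<String> buffer = new ArrayDeque<>();
        private boolean exhausted = false;

        BatchedScanner(FakeServer server, int batchSize) {
            this.server = server;
            this.batchSize = batchSize;
        }

        /** Returns the next row, or null when the scan is finished. */
        String next() {
            if (buffer.isEmpty() && !exhausted) {
                List<String> batch = server.nextBatch(batchSize);
                // A short batch means the server has no more rows to send.
                if (batch.size() < batchSize) exhausted = true;
                buffer.addAll(batch);
            }
            return buffer.poll();
        }
    }

    /** Scans every row and reports how many simulated RPCs that took. */
    static int countRpcs(int totalRows, int batchSize) {
        FakeServer server = new FakeServer(totalRows);
        BatchedScanner scanner = new BatchedScanner(server, batchSize);
        while (scanner.next() != null) { /* consume the row */ }
        return server.rpcCalls;
    }

    public static void main(String[] args) {
        System.out.println("batch=1:   " + countRpcs(10000, 1) + " calls");
        System.out.println("batch=100: " + countRpcs(10000, 100) + " calls");
    }
}
```

The trade-off in a real system is memory and latency per call: larger batches mean fewer round trips but bigger responses to marshal and hold on both sides.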

St.Ack



>
>
>
>
> ------------------------------------------------------------------------
>
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> <configuration>
>     <property>
>         <name>hbase.master</name>
>         <value>172.30.20.15:60000</value>
>         <description>The host and port that the HBase master runs at.
>             TODO: Support 'local' (All running in single context).
>         </description>
>     </property>
>     <property>
>         <name>hbase.regionserver</name>
>         <value>172.30.20.15:60010</value>
>         <description>The host and port a HBase region server runs at.
>         </description>
>     </property>
>     <property>
>         <name>hbase.regiondir</name>
>         <value>hbase</value>
>         <description>The directory shared by region servers.
>         </description>
>     </property>
>     <property>
>         <name>hbase.hregion.max.filesize</name>
>         <value>134217728</value>
>         <description>
>             Maximum desired file size for an HRegion.  If filesize exceeds
>             value + (value / 2), the HRegion is split in two.  Default: 128M (134217728).
>         </description>
>     </property>
>     <property>
>         <name>hbase.hregion.compactionThreshold</name>
>         <value>10</value>
>         <description>
>             By default, we compact the region if an HStore has more than 10 map files.
>         </description>
>     </property>
>     <property>
>         <name>hbase.regionserver.thread.splitcompactcheckfrequency</name>
>         <value>40000</value>
>         <description>
>             Default: 60 * 1000
>         </description>
>     </property>
>     
>     <property>
>         <name>hbase.client.pause</name>
>         <value>2000</value>
>     </property>
>     <property>
>         <name>hbase.client.timeout.number</name>
>         <value>30000</value>
>         <description>Try this many timeouts before giving up.
>         </description>
>     </property>
>     <property>
>         <name>hbase.client.retries.number</name>
>         <value>15</value>
>         <description>Count of maximum retries fetching the root region from root
>             region server.
>         </description>
>     </property>
>     <property>
>         <name>hbase.master.meta.thread.rescanfrequency</name>
>         <value>60000</value>
>         <description>How long the HMaster sleeps (in milliseconds) between scans of
>             the root and meta tables.
>         </description>
>     </property>
>     <property>
>         <name>hbase.master.lease.period</name>
>         <value>30000</value>
>         <description>HMaster server lease period in milliseconds. Default is
>             30 seconds.</description>
>     </property>
>     <property>
>         <name>hbase.server.thread.wakefrequency</name>
>         <value>5000</value>
>         <description>Time to sleep in between searches for work (in milliseconds).
>             Used as sleep interval by service threads such as META scanner and log roller.
>         </description>
>     </property>
>     <property>
>         <name>hbase.regionserver.lease.period</name>
>         <value>30000</value>
>         <description>HRegion server lease period in milliseconds. Default is
>             30 seconds.</description>
>     </property>
>     <property>
>         <name>hbase.regionserver.handler.count</name>
>         <value>10</value>
>         <description>Count of RPC Server instances spun up on RegionServers
>             Same property is used by the HMaster for count of master handlers.
>             Default is 10.
>         </description>
>     </property>
>     <property>
>         <name>hbase.regionserver.msginterval</name>
>         <value>500</value>
>         <description>Interval between messages from the RegionServer to HMaster
>             in milliseconds.  Default is 15 sec. Set this value low if you want unit
>             tests to be responsive.
>         </description>
>     </property>
>     
> </configuration>
>   

