hadoop-common-dev mailing list archives

From James Kennedy <james.kenn...@troove.net>
Subject Re: HBase Performance Q's
Date Thu, 21 Jun 2007 00:28:05 GMT
Ok St.Ack,

I've re-profiled with your recent updates.  Overall the benchmark seems 
4.1% faster averaged over several runs.  I can't say I noticed a 
significant improvement in the readAndProcess()/ObjectWritable.write() 
methods. I'll definitely add a vote to HADOOP-414.

I definitely notice that UTF8.write() takes up 78% of the overall write 
time and is 50% of the time that the 100061 calls to 
HRegionServer.next() take.

UTF8.readString() takes up only 15% with 
sun.nio.ch.SocketChannelImpl.read(ByteBuffer)  (41%) being the biggest 
piece of the pie.


Michael Stack wrote:
> I recently spent some time profiling hbase writing.  I haven't yet 
> spent time on reading.  The patch in HADOOP-1498 has fixes that came 
> of the write profiling sessions.  After applying the patch, running 
> the http://wiki.apache.org/lucene-hadoop/Hbase/PerformanceEvaluation 
> write tests, we seem to be spending > 90% of allocations and CPU time 
> in RPC reading and writing instances of the deprecated UTF8 class (I 
> believe HADOOP-414 is the issue that covers removing UTF8 from the 
> RPC.  If you are seeing the same phenomenon, you might add a vote so 
> the issue gets pegged at a higher priority).  Does the HADOOP-1498 
> patch help with the performance profile you are seeing in hbase?
>
> More below...
>
> James Kennedy wrote:
>> ...
>> Observations:
>>
>> Within the combined RPC$server.call() of all Handler threads, 
>> HRegionServer.next() took about 90% in both the 10K and 100K cases.  
>> This is as expected since it's called 10K/100K times. The biggest 
>> bottleneck below that is HStoreKey.extractFamily(), which accounts 
>> for 33%/39% of HRegionServer.next() and is more expensive than all 
>> the IO operations in HStoreScanner.getNext() (31%/31%). Couldn't 
>> there be a cheaper way to extract the family from a column key?
>
> The HADOOP-1498 patch redoes extractFamily so it uses Text#find and 
> then a byte array copy to produce the column family.  It used to do a 
> toString followed by an indexOf, substring, and then a new Text out of 
> the produced substring.
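
For anyone following along, the byte-level idea can be sketched roughly 
like this.  This is plain illustrative Java, not the actual patch; the 
class/method names and whether the returned family keeps the ':' 
delimiter are assumptions here:

```java
// Illustrative sketch only: extract the column-family prefix of a
// "family:qualifier" key by scanning raw bytes for the ':' delimiter,
// avoiding the old toString -> indexOf -> substring -> new Text round trip.
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class FamilyExtractor {
    // Returns a copy of the bytes up to and including the ':' delimiter.
    // (Whether the real code keeps the delimiter is an assumption.)
    static byte[] extractFamily(byte[] columnKey) {
        for (int i = 0; i < columnKey.length; i++) {
            if (columnKey[i] == ':') {
                return Arrays.copyOfRange(columnKey, 0, i + 1);
            }
        }
        throw new IllegalArgumentException("no ':' delimiter in column key");
    }

    public static void main(String[] args) {
        byte[] key = "contents:cache".getBytes(StandardCharsets.UTF_8);
        byte[] family = extractFamily(key);
        System.out.println(new String(family, StandardCharsets.UTF_8)); // prints "contents:"
    }
}
```

The point is simply that a single pass over the bytes plus one small 
array copy replaces several String allocations per call, which matters 
when the method runs a million-plus times per scan.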
>
>> Outside the RPC$server.call(),
>>
>> Server$Connection.readAndProcess() takes about 17%/22% of the time 
>> that HRegionServer.next() takes, suggesting heavy overhead in RPC IO 
>> and unmarshalling.  In the other direction, 
>> ObjectWritable.write() + DataOutputStream.flush() take 63%/57% of the 
>> time that HRegionServer.next() takes.  The ratio of input to output 
>> makes sense since more data is returned than sent, but this seems 
>> like a lot of RPC overhead.
>>
>
> Yes.  See my comment above on what I'm seeing writing.
>
> ...
>>
>> Right now, if i assume that YourKit is counting block time and I 
>> ignore those numbers, I come to the following conclusions:
>>
>> It looks like HStoreKey.extractFamily() is a bottleneck.  For 100k 
>> rows and 10 cols, it will be called over 1,000,000 times during a 
>> full table scan (even if just retrieving one col) and each call 
>> requires multiple byte[] <-> string conversions.
>>
> If you have suggestions for how to improve on the version in 
> HADOOP-1498, would love to hear them.
>
>> RPC overhead can be quite high during a scan. Could 
>> batched/read-ahead scanning be useful to minimize RPC calls?
>
> Moving from UTF8 to Text would help some with time spent in RPC, 
> going by some primitive tests I ran locally.
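
To illustrate why a byte-backed writable helps (a self-contained 
sketch, not Hadoop's actual UTF8/Text code): a UTF8-style write 
re-encodes the Java String into a fresh byte array on every call, 
while a Text-like wrapper can cache the encoded bytes once and write 
them repeatedly:

```java
// Illustrative sketch: per-call String re-encoding vs. writing cached bytes.
// Not the real Hadoop serialization format; just the allocation pattern.
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class WritableSketch {
    // UTF8-style: re-encode the String on every write (fresh array per call).
    static void writeStringEachTime(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeShort(bytes.length);
        out.write(bytes);
    }

    // Text-style: encode once up front, then write the cached bytes each call.
    static void writeCachedBytes(DataOutputStream out, byte[] cached) throws IOException {
        out.writeShort(cached.length);
        out.write(cached);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream a = new ByteArrayOutputStream();
        ByteArrayOutputStream b = new ByteArrayOutputStream();
        byte[] cached = "anchor:link".getBytes(StandardCharsets.UTF_8);
        writeStringEachTime(new DataOutputStream(a), "anchor:link");
        writeCachedBytes(new DataOutputStream(b), cached);
        // Same wire bytes, but the second path allocates nothing per call.
        System.out.println(java.util.Arrays.equals(a.toByteArray(), b.toByteArray())); // prints "true"
    }
}
```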
>
> Batching, as per your suggestion in HADOOP-1468, would also help.
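
A client-side read-ahead batch could look something like the 
following.  The RowSource interface and all names here are 
hypothetical stand-ins for the RPC proxy, just to show the shape of 
the idea:

```java
// Illustrative sketch of read-ahead batching: fetch N rows per round trip
// and serve subsequent next() calls from a local buffer, so a 100K-row scan
// costs ~100K/N RPCs instead of 100K.
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class BatchingScanner {
    interface RowSource {                // hypothetical stand-in for the RPC proxy
        List<String> nextBatch(int n);   // up to n rows; empty list at end of scan
    }

    private final RowSource source;
    private final int batchSize;
    private final Deque<String> buffer = new ArrayDeque<>();
    private boolean exhausted = false;

    BatchingScanner(RowSource source, int batchSize) {
        this.source = source;
        this.batchSize = batchSize;
    }

    // One remote fetch per batchSize rows instead of one per row.
    String next() {
        if (buffer.isEmpty() && !exhausted) {
            List<String> batch = source.nextBatch(batchSize);
            if (batch.isEmpty()) exhausted = true;
            else buffer.addAll(batch);
        }
        return buffer.poll();            // null once the source is drained
    }
}
```

The trade-off, of course, is client-side memory for the buffered rows 
and slightly stale reads within a batch.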
>
> St.Ack
>
>
>
>>
>>
>>
>>
>> ------------------------------------------------------------------------
>>
>> <?xml version="1.0"?>
>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>> <configuration>
>>     <property>
>>         <name>hbase.master</name>
>>         <value>172.30.20.15:60000</value>
>>         <description>The host and port that the HBase master runs at.
>>             TODO: Support 'local' (All running in single context).
>>         </description>
>>     </property>
>>     <property>
>>         <name>hbase.regionserver</name>
>>         <value>172.30.20.15:60010</value>
>>         <description>The host and port an HBase region server runs at.
>>         </description>
>>     </property>
>>     <property>
>>         <name>hbase.regiondir</name>
>>         <value>hbase</value>
>>         <description>The directory shared by region servers.
>>         </description>
>>     </property>
>>     <property>
>>         <name>hbase.hregion.max.filesize</name>
>>         <value>134217728</value>
>>         <description>
>>             Maximum desired file size for an HRegion.  If filesize 
>> exceeds
>>             value + (value / 2), the HRegion is split in two.  
>> Default: 128M (134217728).
>>         </description>
>>     </property>
>>     <property>
>>         <name>hbase.hregion.compactionThreshold</name>
>>         <value>10</value>
>>         <description>
>>             By default, we compact the region if an HStore has more 
>> than 10 map files.
>>         </description>
>>     </property>
>>     <property>
>>         
>> <name>hbase.regionserver.thread.splitcompactcheckfrequency</name>
>>         <value>40000</value>
>>         <description>
>>             Default: 60 * 1000
>>         </description>
>>     </property>
>>     <property>
>>         <name>hbase.client.pause</name>
>>         <value>2000</value>
>>     </property>
>>     <property>
>>         <name>hbase.client.timeout.number</name>
>>         <value>30000</value>
>>         <description>Try this many timeouts before giving up.
>>         </description>
>>     </property>
>>     <property>
>>         <name>hbase.client.retries.number</name>
>>         <value>15</value>
>>         <description>Count of maximum retries fetching the root 
>> region from root
>>             region server.
>>         </description>
>>     </property>
>>     <property>
>>         <name>hbase.master.meta.thread.rescanfrequency</name>
>>         <value>60000</value>
>>         <description>How long the HMaster sleeps (in milliseconds) 
>> between scans of
>>             the root and meta tables.
>>         </description>
>>     </property>
>>     <property>
>>         <name>hbase.master.lease.period</name>
>>         <value>30000</value>
>>         <description>HMaster server lease period in milliseconds. 
>> Default is
>>             30 seconds.</description>
>>     </property>
>>     <property>
>>         <name>hbase.server.thread.wakefrequency</name>
>>         <value>5000</value>
>>         <description>Time to sleep in between searches for work (in 
>> milliseconds).
>>             Used as sleep interval by service threads such as META 
>> scanner and log roller.
>>         </description>
>>     </property>
>>     <property>
>>         <name>hbase.regionserver.lease.period</name>
>>         <value>30000</value>
>>         <description>HRegion server lease period in milliseconds. 
>> Default is
>>             30 seconds.</description>
>>     </property>
>>     <property>
>>         <name>hbase.regionserver.handler.count</name>
>>         <value>10</value>
>>         <description>Count of RPC Server instances spun up on 
>> RegionServers.
>>             Same property is used by the HMaster for count of master 
>> handlers.
>>             Default is 10.
>>         </description>
>>     </property>
>>     <property>
>>         <name>hbase.regionserver.msginterval</name>
>>         <value>500</value>
>>         <description>Interval between messages from the RegionServer 
>> to HMaster
>>             in milliseconds.  Default is 15 sec. Set this value low 
>> if you want unit
>>             tests to be responsive.
>>         </description>
>>     </property>
>>     </configuration>
>>   
>

