hbase-user mailing list archives

From y_823...@tsmc.com
Subject Re: HBase reading performance
Date Tue, 02 Mar 2010 01:37:44 GMT
Hi,

We treat HBase as a data grid.
Many HBase Java clients in our compute grid (GridGain) fetch data from HBase
concurrently.
Our data is normalized data from Oracle; the compute code performs joins and
some aggregations.
So our POC job is: load the tables' data from HBase -> compute on the data
(join & aggregation) -> save the results back to HBase.
It does very well when we run 10 jobs with 10 concurrent clients: it took
53 sec.
We expected our 20 machines to reach a completion time of about 60 sec when
running 200 jobs (200 concurrent clients),
but in fact the clients all blocked in the following code:
      IndexedTable idxTable1 = new IndexedTable(config, Bytes.toBytes("Table1"));
The results, which we are not satisfied with, are as follows:
     200 clients   839 sec
     400 clients  1801 sec
We estimate that about 85% of the time was spent in new IndexedTable once the
number of clients reached 200.
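To put those figures side by side, here is a small arithmetic sketch. It uses only the numbers quoted in this message (839 sec, 1801 sec, the 60 sec target, and the 85% estimate); the class and method names are made up for illustration.

```java
final class ScalingArithmetic {

    /** Seconds attributed to one phase, given the total time and that phase's share (0..1). */
    static double phaseSeconds(double totalSeconds, double share) {
        return totalSeconds * share;
    }

    /** How many times larger the measured time is than the reference time. */
    static double ratio(double measuredSeconds, double referenceSeconds) {
        return measuredSeconds / referenceSeconds;
    }

    public static void main(String[] args) {
        // 85% of the 200-client run, i.e. the time estimated to be spent in new IndexedTable.
        System.out.printf("time in constructor at 200 clients: %.1f sec%n", phaseSeconds(839, 0.85));
        // Measured 200-client time against the 60 sec that was expected.
        System.out.printf("slowdown vs. 60 sec target: %.1fx%n", ratio(839, 60));
        // Effect of doubling the clients from 200 to 400.
        System.out.printf("400 clients vs. 200 clients: %.2fx%n", ratio(1801, 839));
    }
}
```

That works out to roughly 713 sec spent constructing table objects at 200 clients, about 14 times the expected completion time, and a bit more than double the time when the client count doubles.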
This suggests that HBase does not serve well when hundreds of clients connect
to it concurrently.
Just construct a table object in your code, then run it concurrently in
threads or on another distributed computing platform,
and you may be able to see what is wrong.
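A minimal sketch of such a threaded reproduction, for anyone who wants to try it. The class name, the method name, and the sleeping stand-in are hypothetical; in a real run the stand-in would be replaced by the constructor call from this message, new IndexedTable(config, Bytes.toBytes("Table1")). All threads are released at the same moment so that the constructor calls really overlap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ConcurrentOpenBenchmark {

    /**
     * Calls opener once in each of `clients` threads, all released together,
     * and returns the elapsed time of each call in milliseconds.
     */
    static long[] timeConcurrentOpens(int clients, Callable<?> opener) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        CountDownLatch start = new CountDownLatch(1);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < clients; i++) {
                futures.add(pool.submit(() -> {
                    start.await();                  // wait until every task is queued
                    long t0 = System.nanoTime();
                    opener.call();                  // the operation under test
                    return (System.nanoTime() - t0) / 1_000_000L;
                }));
            }
            start.countDown();                      // release all threads at once
            long[] elapsedMs = new long[clients];
            for (int i = 0; i < clients; i++) {
                elapsedMs[i] = futures.get(i).get();
            }
            return elapsedMs;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in (assumption) for: new IndexedTable(config, Bytes.toBytes("Table1"))
        Callable<Object> opener = () -> {
            Thread.sleep(10);
            return new Object();
        };
        long[] elapsedMs = timeConcurrentOpens(20, opener);
        long slowest = 0;
        for (long ms : elapsedMs) {
            slowest = Math.max(slowest, ms);
        }
        System.out.println("clients=" + elapsedMs.length + " slowest open=" + slowest + " ms");
    }
}
```

Running it with increasing client counts and comparing the slowest call should show whether the time per constructor call grows with concurrency.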
If HBase only targets a few web server connections, that's OK,
but an RDBMS can serve a thousand concurrent connections, so the HBase
architecture seems to need adjusting.
That's my opinion!



Fleming Chiu(邱宏明)
707-6128
y_823910@tsmc.com
Meat-Free Monday: eat vegetarian, save the Earth (Meat Free Monday Taiwan)




From:     jdcryans@gmail.com
Sent by:  jdcryans@gmail.com
To:       hbase-user@hadoop.apache.org
cc:       bcwalrus@cloudera.com, kevin_hung@tsmc.com (bcc: Y_823910/TSMC)
Subject:  Re: HBase reading performance
Date:     2010/03/02 03:25 AM
Please respond to hbase-user




In this particular case a lot of things come into play:

- Creating a table is a long process because the client sleeps a lot,
6 seconds before 0.20.3, 2 seconds in 0.20.3 and even less than that
in the current head of branch.

- in 0.20, without the HDFS-200 patch, HDFS doesn't support fs syncs
so we force memstore flushes at something like 8MB so that you don't
lose too much data on that very important table (hopefully in 0.21
it's supported, no data loss yeah!). So all those memstore flushes can
account for a lot of traffic and can generate a lot more compactions.

What exactly is your test trying to show? I'm really not sure... that
tables with very small memstores take edits at a slower rate?

J-D

2010/2/28  <y_823910@tsmc.com>:
> Hi,
> I started 200 clients (spread across 20 machines) to run NewHTableTest,
> shown in the following code, which took 983 seconds.
> The META table resides in just one region, and that machine's CPU and
> network traffic are very high while NewHTableTest is running, so I guess
> there is a bottleneck at ZooKeeper or the META table's server.
> Any suggestions?
>
>
> My Cluster:
>   1U servers(4core,12G ram): 20
>   zookeepers     :   3
>   region servers :  10
>   regions        :1500
>
>
> public void NewHTableTest() throws IOException {
>     IndexedTable idxTable1 = new IndexedTable(config, Bytes.toBytes("Table1"));
>     IndexedTable idxTable2 = new IndexedTable(config, Bytes.toBytes("Table2"));
>     IndexedTable idxTable3 = new IndexedTable(config, Bytes.toBytes("Table3"));
>     IndexedTable idxTable4 = new IndexedTable(config, Bytes.toBytes("Table4"));
>     IndexedTable idxTable5 = new IndexedTable(config, Bytes.toBytes("Table5"));
>     IndexedTable idxTable6 = new IndexedTable(config, Bytes.toBytes("Table6"));
>     IndexedTable idxTable7 = new IndexedTable(config, Bytes.toBytes("Table7"));
>     IndexedTable idxTable8 = new IndexedTable(config, Bytes.toBytes("Table8"));
>     IndexedTable idxTable9 = new IndexedTable(config, Bytes.toBytes("Table9"));
>     IndexedTable idxTable10 = new IndexedTable(config, Bytes.toBytes("Table10"));
>     IndexedTable idxTable11 = new IndexedTable(config, Bytes.toBytes("Table11"));
>     IndexedTable idxTable12 = new IndexedTable(config, Bytes.toBytes("Table12"));
>     IndexedTable idxTable13 = new IndexedTable(config, Bytes.toBytes("Table13"));
>     IndexedTable idxTable14 = new IndexedTable(config, Bytes.toBytes("Table14"));
>     IndexedTable idxTable15 = new IndexedTable(config, Bytes.toBytes("Table15"));
>     IndexedTable idxTable16 = new IndexedTable(config, Bytes.toBytes("Table16"));
>     IndexedTable idxTable17 = new IndexedTable(config, Bytes.toBytes("Table17"));
>     IndexedTable idxTable18 = new IndexedTable(config, Bytes.toBytes("Table18"));
>     IndexedTable idxTable19 = new IndexedTable(config, Bytes.toBytes("Table19"));
>     IndexedTable idxTable20 = new IndexedTable(config, Bytes.toBytes("Table20"));
>     IndexedTable idxTable21 = new IndexedTable(config, Bytes.toBytes("Table21"));
>     IndexedTable idxTable22 = new IndexedTable(config, Bytes.toBytes("Table22"));
>     IndexedTable idxTable23 = new IndexedTable(config, Bytes.toBytes("Table23"));
>     IndexedTable idxTable24 = new IndexedTable(config, Bytes.toBytes("Table24"));
>     IndexedTable idxTable25 = new IndexedTable(config, Bytes.toBytes("Table25"));
>     IndexedTable idxTable26 = new IndexedTable(config, Bytes.toBytes("Table26"));
>     IndexedTable idxTable27 = new IndexedTable(config, Bytes.toBytes("Table27"));
>     IndexedTable idxTable28 = new IndexedTable(config, Bytes.toBytes("Table28"));
>     IndexedTable idxTable29 = new IndexedTable(config, Bytes.toBytes("Table29"));
>     IndexedTable idxTable30 = new IndexedTable(config, Bytes.toBytes("Table30"));
> }
>
>
>
>
> Fleming Chiu(邱宏明)
> 707-6128
> y_823910@tsmc.com
> Meat-Free Monday: eat vegetarian, save the Earth (Meat Free Monday Taiwan)
>
>
>
> ---------------------------------------------------------------------------
>                                                         TSMC PROPERTY
>  This email communication (and any attachments) is proprietary information
>  for the sole use of its intended recipient. Any unauthorized review, use
>  or distribution by anyone other than the intended recipient is strictly
>  prohibited.  If you are not the intended recipient, please notify the
>  sender by replying to this email, and then delete this email and any
>  copies of it immediately. Thank you.
> ---------------------------------------------------------------------------
>
>
>
>







