hbase-dev mailing list archives

From "Chao Shi (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-10305) Batch update performance drops as the number of regions grows
Date Thu, 09 Jan 2014 06:03:51 GMT
Chao Shi created HBASE-10305:

             Summary: Batch update performance drops as the number of regions grows
                 Key: HBASE-10305
                 URL: https://issues.apache.org/jira/browse/HBASE-10305
             Project: HBase
          Issue Type: Bug
          Components: Performance
            Reporter: Chao Shi

In our use case, a small number (~5) of proxy programs read from a queue and send batch
updates to HBase. Each program is multi-threaded, and the HBase client will batch mutations to each

We found that TPS drops as the number of regions grows. I think the reason is that the RS syncs
the HLog once for each region. Suppose there is a single region: the batch update touches only one
region and therefore syncs the HLog once. Now suppose there are 10 regions per server: in RS#multi()
it has to process the updates for each individual region and sync the HLog 10 times.
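
The sync-count argument above can be illustrated with a toy model. This is only a sketch, not HBase code: the function names, the integer row keys, and the modulo-based region assignment are all assumptions made for illustration, and the "sync once per batch" variant is a hypothetical point of comparison rather than a behavior or fix described in this report.

```python
from collections import defaultdict


def group_by_region(batch, region_of):
    """Group mutations by the region that hosts their row key."""
    groups = defaultdict(list)
    for row_key, value in batch:
        groups[region_of(row_key)].append((row_key, value))
    return groups


def syncs_per_region(batch, region_of):
    """Model of the reported behavior: one log sync per region touched."""
    return len(group_by_region(batch, region_of))


def syncs_per_batch(batch, region_of):
    """Hypothetical comparison: append for every region, then sync once."""
    return 1 if batch else 0


def make_region_of(num_regions):
    """Toy region assignment (assumption): integer row key modulo region count."""
    return lambda row_key: row_key % num_regions


# A batch of 100 independent mutations with integer row keys 0..99.
batch = [(key, b"v") for key in range(100)]

single_region_syncs = syncs_per_region(batch, make_region_of(1))
ten_region_syncs = syncs_per_region(batch, make_region_of(10))
```

In this model the same 100-mutation batch costs one sync with a single region and ten syncs with ten regions, so the sync count per batch grows with the number of regions the batch touches, which matches the reasoning in the report.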

Please note that in our scenario, batched mutations are usually independent of each other
and touch a varying number of regions.

We are using the 0.94 series, but after a quick look at the code, I think trunk has the
same problem.

This message was sent by Atlassian JIRA
