hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: how to do parallel scanning in map reduce using hbase as input?
Date Thu, 26 Jun 2014 13:11:17 GMT
80 regions over 5 nodes - that's 16 per server. 

How big is the average region?
Have you considered splitting the existing regions?
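
The per-server figure above, together with the numbers in the quoted message below, can be worked through as a short calculation. This is only back-of-the-envelope arithmetic: it assumes regions are spread evenly across servers, that the default of one map task per region applies, and that all mapper slots are available to this job. The variable names are illustrative.

```python
import math

# Figures taken from the thread.
regions = 80
nodes = 5
mapper_slots_per_node = 4
total_rows = 700_000_000

# Assumes an even spread of regions across region servers.
regions_per_server = regions // nodes

# Map tasks that can run at once across the cluster.
concurrent_mappers = nodes * mapper_slots_per_node

# With one map task per region, the tasks run in this many waves.
map_waves = math.ceil(regions / concurrent_mappers)

# Average rows each map task has to scan, assuming equally sized regions.
rows_per_region = total_rows // regions

print(regions_per_server, concurrent_mappers, map_waves, rows_per_region)
```

Under these assumptions the job already has more map tasks (80) than mapper slots (20), so the number of tasks running at once is capped by the slots rather than by the region count.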

Cheers

On Jun 26, 2014, at 12:34 AM, Li Li <fancyerii@gmail.com> wrote:

> My table has about 700 million rows and about 80 regions. Each task
> tracker is configured to run 4 mappers and 4 reducers at the same
> time. The hadoop/hbase cluster has 5 nodes, so 20 mappers run at
> once. It takes more than an hour to finish the map stage.
> The hbase cluster's load is very low, about 2,000 requests per second.
> I think one mapper per region is too few. How can I run more than
> one mapper for a region so that the job can take full advantage of
> the computing resources?
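
One way to picture "more than one mapper for a region" is to cut a region's row-key range into several contiguous sub-ranges and give each sub-range to its own scan. The sketch below shows only the key arithmetic. It is not an HBase API and not the approach suggested in the reply above (which is to split the regions themselves). `split_key_range` and its `width` parameter are hypothetical, and the sketch assumes non-empty start and stop keys and row keys that are roughly uniformly distributed as byte strings.

```python
def split_key_range(start: bytes, stop: bytes, parts: int, width: int = 8):
    """Divide the key range [start, stop) into up to `parts` contiguous sub-ranges.

    Hypothetical helper: keys are compared as big-endian integers over their
    first `width` bytes (right-padded with zero bytes if shorter).
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    lo = int.from_bytes(start.ljust(width, b"\x00")[:width], "big")
    hi = int.from_bytes(stop.ljust(width, b"\x00")[:width], "big")
    if hi <= lo:
        raise ValueError("stop must sort after start")

    # Evenly spaced interior split points, dropping any that collapse onto
    # the lower bound when the range is narrower than `parts`.
    interior = sorted({lo + (hi - lo) * i // parts for i in range(1, parts)} - {lo})

    # Keep the caller's original start and stop as the outer bounds so the
    # sub-ranges cover exactly the same keys as the input range.
    keys = [start] + [b.to_bytes(width, "big") for b in interior] + [stop]
    return list(zip(keys[:-1], keys[1:]))


# Example: four sub-ranges of a one-byte key space.
ranges = split_key_range(b"\x00", b"\x04", 4, width=1)
for sub_start, sub_stop in ranges:
    print(sub_start.hex(), sub_stop.hex())
```

Each `(sub_start, sub_stop)` pair could then serve as the start and stop row of a separate scan. Whether that actually helps depends on the key distribution and on how many mapper slots are free, since skewed keys would produce sub-ranges of very different sizes.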
