hbase-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: bulk loading regions number
Date Mon, 10 Sep 2012 08:24:17 GMT
Hi Oleg,

If the root issue is the growing number of regions, why not control that
directly instead of looking for a way to control the reducer count? You
could, for example, raise the region split size (the maximum HFile size
at which a region splits) so that regions split less often, giving you
larger but fewer regions.

Given that you have 10 machines, I'd go this way rather than ending up
with a lot of regions causing issues with load.
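
As a rough illustration of why a larger split size helps, here is a
back-of-the-envelope sketch. It assumes the setting in question is
hbase.hregion.max.filesize and that the region count is roughly the data
volume divided by that threshold. The weekly volume of 50 GB is a
hypothetical figure, chosen only because it gives 200 regions at a
256 MB split size; it is not a number from this thread. Since a split
produces two half-size regions, the real count can be up to about twice
this estimate.

```python
# Rough estimate of region count as a function of the split threshold.
# Assumption: a region splits once it grows past hbase.hregion.max.filesize,
# so region count is approximately total data size / threshold.

MB = 1024 ** 2
GB = 1024 ** 3


def estimated_regions(total_bytes: int, max_filesize_bytes: int) -> int:
    """Return ceil(total_bytes / max_filesize_bytes), with a minimum of 1."""
    if max_filesize_bytes <= 0:
        raise ValueError("max_filesize_bytes must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return max(1, -(-total_bytes // max_filesize_bytes))


# Hypothetical weekly volume (not from the thread).
weekly_bytes = 50 * GB

for label, split_size in [("256 MB", 256 * MB), ("2 GB", 2 * GB), ("10 GB", 10 * GB)]:
    per_week = estimated_regions(weekly_bytes, split_size)
    per_year = estimated_regions(52 * weekly_bytes, split_size)
    print(f"split size {label}: ~{per_week} regions/week, ~{per_year} regions/year")
```

Under these assumptions, going from 256 MB to 2 GB cuts the weekly region
count (and so the reducer count of the bulk load job) by a factor of eight.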

On Mon, Sep 10, 2012 at 1:49 PM, Oleg Ruchovets <oruchovets@gmail.com> wrote:
> Hi,
>   I am using bulk loading to write my data to HBase.
> It works fine, but the number of regions is growing very rapidly.
> After entering ONE WEEK of data I had 200 regions (I am going to save years of
> data).
> As a result, the job that writes data to HBase has a number of REDUCERS equal
> to the number of REGIONS.
> So after entering only one WEEK of data I have 200 reducers.
> Question:
>    How can I resolve the problem of a constantly growing number of reducers
> when using bulk loading and TotalOrderPartitioner?
>  I have a 10-machine cluster and I think I should have ~30 reducers.
> Thanks in advance.
> Oleg.

Harsh J
