hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: HBase export limit bandwith
Date Thu, 05 Jun 2014 12:11:52 GMT

So when the basic tools don't work... 
How about rolling your own? 

Step 1: take a snapshot and write the file(s) to a different location outside of /hbase. 
(Export to local disk on the cluster.)
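A sketch of step 1 with the stock tooling, assuming a recent HBase that ships the ExportSnapshot tool; the table name, snapshot name, and target path are placeholders. Note that ExportSnapshot also takes a -bandwidth flag (MB/s per mapper) on versions that support it, which may be enough throttling on its own:

```shell
# Take a snapshot from the HBase shell (cheap; no data is copied)
echo "snapshot 'mytable', 'mytable-backup-snap'" | hbase shell

# Copy the snapshot files out of /hbase to a separate location;
# -mappers caps the number of concurrent copiers, and -bandwidth
# (where available) throttles each one in MB/s
hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
  -snapshot mytable-backup-snap \
  -copy-to hdfs:///backups/hbase \
  -mappers 4 -bandwidth 10
```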

Step 2: write your own M/R job and control the number of mappers that read from HDFS and write
to S3. 
That assumes you want a block-for-block match. If you want to change the number of files (each region
would otherwise be a separate file), you could do the write to S3 in the reduce phase instead. 
(Which is what you want.) 
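A minimal driver sketch of step 2, assuming the exported snapshot files sit in plain HDFS and an S3 filesystem (s3a://, or s3n:// on older Hadoop) is configured; the paths and bucket name are placeholders. With no mapper or reducer class set, the identity classes are used, so the reduce count becomes the cap on concurrent S3 writers:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ThrottledCopy {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "throttled-s3-copy");
    job.setJarByClass(ThrottledCopy.class);

    // Identity map; the actual S3 writes happen in the reduce phase,
    // so the reducer count is the ceiling on concurrent uploaders.
    job.setNumReduceTasks(4);

    FileInputFormat.addInputPath(job, new Path("hdfs:///backups/hbase"));
    FileOutputFormat.setOutputPath(job, new Path("s3a://my-bucket/hbase-backup"));

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

This is the opposite of what the stock Export class does (it hardcodes zero reducers and writes straight from the maps), which is exactly why rolling your own gives you the throttle.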

On Jun 4, 2014, at 7:39 AM, Damien Hardy <dhardy@viadeoteam.com> wrote:

> Hello,
> We are trying to export an HBase table to S3 for backup purposes.
> By default the export tool runs a map per region, and we want to limit output
> bandwidth to the internet (to Amazon S3).
> We were thinking of adding some reducers to limit the number of writers,
> but this is explicitly hardcoded to 0 in the Export class:
> ```
>    // No reducers. Just write straight to output files.
>    job.setNumReduceTasks(0);
> ```
> Is there another way (a property?) in Hadoop to limit output bandwidth?
> -- 
> Damien

The opinions expressed here are mine, while they may reflect a cognitive thought, that is
purely accidental. 
Use at your own risk. 
Michael Segel
michael_segel (AT) hotmail.com
