hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: How to split a specified number of rows per Map
Date Sun, 05 Jun 2011 14:31:36 GMT
You need to modify getSplits().
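
A minimal sketch of the boundary computation that a modified getSplits() would need, written in plain Java with no HBase dependency so it can run on its own. The names RowCountSplitter, RowRange, and splitByRowCount are hypothetical and are not part of HBase. The sketch also assumes the sorted row keys are already known, for example from a preliminary scan, which has a cost of its own on a large table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not an HBase class: cuts a sorted key list into
// fixed-size row ranges that a custom getSplits() could turn into splits.
final class RowCountSplitter {

    // One range of rows: startRow is inclusive, stopRow is exclusive.
    // An empty stopRow means "until the end of the table".
    static final class RowRange {
        final String startRow;
        final String stopRow;

        RowRange(String startRow, String stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }

        @Override
        public String toString() {
            return "[" + startRow + ", " + (stopRow.isEmpty() ? "<end>" : stopRow) + ")";
        }
    }

    // Returns one range per group of rowsPerSplit consecutive keys.
    // The last range may hold fewer rows than rowsPerSplit.
    static List<RowRange> splitByRowCount(List<String> sortedRowKeys, int rowsPerSplit) {
        if (rowsPerSplit <= 0) {
            throw new IllegalArgumentException("rowsPerSplit must be positive");
        }
        List<RowRange> ranges = new ArrayList<>();
        for (int i = 0; i < sortedRowKeys.size(); i += rowsPerSplit) {
            String start = sortedRowKeys.get(i);
            int next = i + rowsPerSplit;
            // The first key of the next group is the exclusive stop row.
            String stop = next < sortedRowKeys.size() ? sortedRowKeys.get(next) : "";
            ranges.add(new RowRange(start, stop));
        }
        return ranges;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("row-001", "row-002", "row-003", "row-004", "row-005");
        for (RowRange range : splitByRowCount(keys, 2)) {
            System.out.println(range);
        }
    }
}
```

In an actual job, each range would presumably become one TableSplit returned from an overridden getSplits(). The sketch follows the HBase scan convention: the start row is inclusive, the stop row is exclusive, and an empty stop row means the end of the table.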

On Sun, Jun 5, 2011 at 4:04 AM, edward choi <mp2893@gmail.com> wrote:

> Hi,
>
> I am using HBase as a source of my MapReduce jobs.
>
> I recently found out that TableInputFormat automatically splits the input
> table so that each region of the table is assigned to a single map task.
>
> But what I want to do is split the input table so that a user-specified
> number of rows is assigned to each Mapper.
>
> For example, if I set a certain parameter to 100, then each Mapper would get
> 100 rows from the input table.
>
> Is there a method for this kind of operation?
> Or do I have to modify the getSplits() of
> org.apache.hadoop.hbase.mapreduce.TableInputFormatBase?
>
> Any answer or opinion will be much appreciated!!
>
> Ed
>
