hbase-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: A proposal for Provide key range support to bulkload to avoid too many reducers (HBASE-9556)
Date Thu, 30 Jul 2015 16:57:29 GMT
The following API doesn't contain start / end keys:
List<InputSplit> getSplits(JobContext context)

You need to pass key range information.
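
To make the point concrete, here is a minimal sketch of what "passing key range information" could mean: given the start / end keys of every region plus a requested range, keep only the regions that overlap it. This is plain Java, not HBase code. KeyRangeSplitSketch, KeyRange, overlaps and selectRegions are hypothetical names made up for illustration, and the sketch assumes half-open [start, end) ranges where an empty key means "unbounded", as with the first and last region of a table.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names are HBase APIs.
final class KeyRangeSplitSketch {

    /** A half-open row-key range [start, end); an empty array means "unbounded". */
    static final class KeyRange {
        final byte[] start;
        final byte[] end;

        KeyRange(String start, String end) {
            this.start = start.getBytes(StandardCharsets.UTF_8);
            this.end = end.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Lexicographic comparison of unsigned bytes, the ordering HBase uses for row keys. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** True when the two half-open ranges share at least one possible row key. */
    static boolean overlaps(KeyRange region, KeyRange wanted) {
        boolean regionStartsBeforeWantedEnds =
            wanted.end.length == 0 || compare(region.start, wanted.end) < 0;
        boolean wantedStartsBeforeRegionEnds =
            region.end.length == 0 || compare(wanted.start, region.end) < 0;
        return regionStartsBeforeWantedEnds && wantedStartsBeforeRegionEnds;
    }

    /** Keeps only the regions that overlap the requested key range. */
    static List<KeyRange> selectRegions(List<KeyRange> regions, KeyRange wanted) {
        List<KeyRange> selected = new ArrayList<>();
        for (KeyRange region : regions) {
            if (overlaps(region, wanted)) {
                selected.add(region);
            }
        }
        return selected;
    }
}
```

With something like this, a getSplits-style method would create one split per selected region instead of one per region in the table. How the range reaches the input format (job configuration, scan boundaries, or a new parameter) is left open here and is the part to settle on the JIRA.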

I suggest continuing the discussion on the JIRA.


On Thu, Jul 30, 2015 at 9:50 AM, beeshma r <beeshma48@gmail.com> wrote:

> Hi,
> I'd like to work on key range support for bulk load to avoid too many
> reducers, as described in these issues (HBASE-9556, HBASE-4063).
> Description and high-level design of the proposed solution:
> Currently, when we bulk load data into HBase through MapReduce using
> TableInputFormatBase, the number of splits matches the number of regions
> in the table.
> So I am going to change the process so that TableInputFormatBase decides
> the key range for the splits.
> For example, if the input data is going to be loaded into 50 regions
> (the RS actually has 400 regions):
>    - List<InputSplit> getSplits(JobContext context) will return a list of
>    exactly 50 splits (currently it returns 400).
> Do I understand this correctly? Please let me know if I am on the wrong
> track. Is anyone willing to mentor me? I am new to the ASF.
> Thanks
> Beeshma
