hbase-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: A proposal for Provide key range support to bulkload to avoid too many reducers (HBASE-9556)
Date Sat, 05 Mar 2016 17:53:48 GMT
I issued the same command in the hbase shell and got 4 regions (see the
tail of this email).
The values given for the SPLITS parameter designate the start keys of
regions. The first region has an empty start key, so three split keys
yield four regions.
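
To make the split-key arithmetic concrete, here is a minimal sketch in plain
Java with no HBase dependency. The class SplitRegions and the method
regionRanges are hypothetical names used only for illustration, and the
sketch assumes the split points are already sorted. It only models the rule
described above: each split point becomes the start key of the next region,
the first region has an empty start key and the last has an empty end key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase: models how split points map to regions.
class SplitRegions {

    // Returns one {startKey, endKey} pair per region; "" stands for an empty key.
    // Assumes splitPoints is sorted in ascending order.
    static List<String[]> regionRanges(List<String> splitPoints) {
        List<String[]> ranges = new ArrayList<>();
        String start = ""; // the first region has an empty start key
        for (String split : splitPoints) {
            ranges.add(new String[] {start, split});
            start = split; // each split point is the start key of the next region
        }
        ranges.add(new String[] {start, ""}); // the last region has an empty end key
        return ranges;
    }

    public static void main(String[] args) {
        List<String[]> ranges = regionRanges(Arrays.asList("a", "b", "c"));
        System.out.println(ranges.size() + " regions");
        for (String[] r : ranges) {
            System.out.println("start='" + r[0] + "' end='" + r[1] + "'");
        }
    }
}
```

With the split points 'a', 'b', 'c' from the create command, the list holds
four ranges, one more than the number of split points.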

For #2, getRegionLocations() returns a Map. You can use the following
method to retrieve the number of regions:

https://docs.oracle.com/javase/7/docs/api/java/util/Map.html#size()

test,,1457200138927.3daccd0d6f9eb42b25625ea09b5e0e35.
test,a,1457200138927.5714dd320e470add17a566c2154b47eb.
test,b,1457200138927.01d01fa4592521d195ac4a7182e7b059.
test,c,1457200138927.334346d5892afc859833dad734353d9b.
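
As a rough illustration of counting regions with Map.size(), the sketch
below uses a plain TreeMap as a stand-in for the map that
getRegionLocations() returns. It is keyed by the four region names listed
above. The class RegionCount, its methods and the server names are
placeholders I made up, and startKeyOf assumes that neither the table name
nor the start key contains a comma.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in, not HBase code: counts regions the same way you
// would on the map returned by getRegionLocations().
class RegionCount {

    // Region name -> server name. The server names are placeholders.
    static NavigableMap<String, String> sampleLocations() {
        NavigableMap<String, String> m = new TreeMap<>();
        m.put("test,,1457200138927.3daccd0d6f9eb42b25625ea09b5e0e35.", "server-1");
        m.put("test,a,1457200138927.5714dd320e470add17a566c2154b47eb.", "server-1");
        m.put("test,b,1457200138927.01d01fa4592521d195ac4a7182e7b059.", "server-2");
        m.put("test,c,1457200138927.334346d5892afc859833dad734353d9b.", "server-2");
        return m;
    }

    // The start key is the second comma-separated field of the region name.
    // Assumes the table name and the start key contain no commas.
    static String startKeyOf(String regionName) {
        return regionName.split(",", 3)[1];
    }

    public static void main(String[] args) {
        NavigableMap<String, String> locations = sampleLocations();
        System.out.println("regions: " + locations.size());
        for (String name : locations.keySet()) {
            System.out.println("start key: '" + startKeyOf(name) + "'");
        }
    }
}
```

The count comes from the number of map entries, so there is no need to
collect region IDs separately.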

On Sat, Mar 5, 2016 at 9:42 AM, beeshma r <beeshma48@gmail.com> wrote:

> Hi Ted,
>
> Regarding the fix for HBASE-9556: I am testing with a pre-split table,
> i.e.
>
> create 'test', 'cf', SPLITS => ['a', 'b', 'c']
>
> which I expected to create 3 regions.
>
> For this case I wrote some logic to find the start keys of the regions.
>
> // Needs org.apache.hadoop.hbase.util.Bytes for decoding the key bytes
> HTable ht = new HTable(con, "test"); // Table object
> NavigableMap<HRegionInfo, ServerName> np = ht.getRegionLocations();
> // Iterate over the region descriptors directly; no list copy is needed
> for (HRegionInfo h : np.keySet()) {
>     System.out.println(h.getRegionId() + " getRegionId");
>     // Decode the start key bytes as UTF-8
>     String s = Bytes.toString(h.getStartKey());
>     System.out.println(s + "-------start key");
> }
>
> With the above code I got 4 regions (4 region IDs). One has an empty
> start key and end key; the remaining regions have start keys a, b and c
> respectively.
>
> My questions are:
>
> 1. How many regions will the command below create?
> create 'test', 'cf', SPLITS => ['a', 'b', 'c']
>
> 2. To find the exact number of regions, can I count the region IDs?
>
>
> cheers
>
> Beeshma
>
>
>
> On Thu, Jul 30, 2015 at 9:57 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> The following API doesn't contain start / end keys:
>> List<InputSplit> getSplits(JobContext context)
>>
>> You need to pass key range information.
>>
>> I suggest continuing the discussion on the JIRA.
>>
>> Cheers
>>
>> On Thu, Jul 30, 2015 at 9:50 AM, beeshma r <beeshma48@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > I'd like to work on key range support for bulkload to avoid too many
>> > reducers, as described in these issues (HBASE-9556, HBASE-4063).
>> >
>> > Description and high-level design of the proposed solution:
>> >
>> > Currently, when we bulk load data into HBase through MapReduce using
>> > TableInputFormatBase, the number of splits matches the number of
>> > regions in the table. I am going to change this so that
>> > TableInputFormatBase decides the key range for the splits.
>> > For example, if the input data is going to be loaded into 50 regions
>> > (while the RS actually has 400 regions):
>> >
>> >    - List<InputSplit> getSplits(JobContext context) will return exactly
>> >      those 50 splits (currently it returns 400)
>> >
>> >
>> > Do I understand this correctly? Please let me know if I am on the wrong
>> > track. Is anyone willing to mentor me? I am new to the ASF.
>> >
>> > Thanks
>> > Beeshma
>> >
>>
