hbase-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: Bulk Load Sample Code
Date Sat, 13 Nov 2010 00:12:15 GMT
I'm surprised that HRegionPartitioner works correctly for incremental load.
It definitely won't work if the regions are also shifting during the MR job.
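
To make the concern concrete, here is a minimal, self-contained sketch of range partitioning by split points, which is the general idea behind both partitioners discussed in this thread. It uses no Hadoop or HBase classes. The class name, method names, and split keys ("g", "n", "t") are made up for illustration; in a real job the boundaries would come from the table's region start keys. If those boundaries change while the job runs, a file written for one range may no longer fit inside a single region.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RangePartitionSketch {
    // Unsigned lexicographic byte comparison, the ordering HBase uses for row keys.
    static int compareRows(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    // Range partitioning: returns the number of split points <= row, so
    // partition i holds keys in [splits[i-1], splits[i]). Splits must be sorted.
    static int rangePartition(byte[] row, byte[][] sortedSplits) {
        int lo = 0, hi = sortedSplits.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (compareRows(row, sortedSplits[mid]) >= 0) lo = mid + 1;
            else hi = mid;
        }
        return lo;
    }

    // Hash partitioning, shown for contrast: neighbouring keys can land on any
    // reducer, so no reducer's output covers one contiguous key range.
    static int hashPartition(byte[] row, int numReducers) {
        return (Arrays.hashCode(row) & Integer.MAX_VALUE) % numReducers;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical boundaries: three split points give four partitions.
        byte[][] splits = { bytes("g"), bytes("n"), bytes("t") };
        for (String key : new String[] { "apple", "grape", "mango", "zebra" }) {
            System.out.println(key
                + " range=" + rangePartition(bytes(key), splits)
                + " hash=" + hashPartition(bytes(key), 4));
        }
    }
}
```

With a single reducer both functions return 0 for every key, which is why a bad partitioner can go unnoticed until the job runs with several reducers.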

Thanks
-Todd

On Fri, Nov 12, 2010 at 10:10 AM, Adam Phelps <amp@opendns.com> wrote:

> On 11/10/10 11:57 AM, Stack wrote:
>
>> On Wed, Nov 10, 2010 at 11:53 AM, Shuja Rehman<shujamughal@gmail.com>
>>  wrote:
>>
>>> Oh! I think you have not read the full post. The essay has 3 paragraphs  :)
>>>
>>> Should I also add the following line?
>>>
>>>  job.setPartitionerClass(TotalOrderPartitioner.class);
>>>
>>>
>> You need to specify something other than the default partitioner, so yes,
>> the above seems necessary. (Be aware that with only one reducer, everything
>> may appear to work even though your partitioner is bad; it's when you have
>> multiple reducers that a bad partitioner will show.)
>>
>
> I skimmed over this thread because we've been using LoadIncrementalHFiles to
> load the output of our MR jobs, but it looks like we're using
> HRegionPartitioner rather than TotalOrderPartitioner.  The current code is
> definitely working; however, the page regarding bulk loads that was posted
> earlier implies that TotalOrderPartitioner is best for efficiency.  What is
> the difference between the two?
>
> - Adam
>



-- 
Todd Lipcon
Software Engineer, Cloudera
