hbase-user mailing list archives

From Ophir Cohen <oph...@gmail.com>
Subject Re: Bulk upload
Date Thu, 11 Aug 2011 08:28:06 GMT
I did some more tests and found the problem: in a local run the distributed
cache does not work.

On the full cluster it works.
Sorry for taking your time...
Ophir

PS
Is there any way to use the distributed cache locally as well (i.e., when I'm
running MR from IntelliJ IDEA)?
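A possible workaround, sketched below under assumptions: this uses the CDH3-era TotalOrderPartitioner.setPartitionFile() API (the class's package and the underlying property name vary between Hadoop/HBase versions, so check your version's sources). Since the LocalJobRunner does not materialize the distributed cache, the idea is to point the partitioner at the partition file in the job configuration directly, after configureIncrementalLoad() has written it:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
// Assumption: package name differs per version; CDH3's HBase ships a backport
// under org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.
import org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner;

public class LocalBulkLoadSetup {
    /**
     * Sketch: make a bulk-load job usable under the LocalJobRunner.
     * configureIncrementalLoad() writes the partition file and registers it
     * in the distributed cache, but a local run never copies the cache, so
     * we also set the partitioner's path property explicitly.
     * partitionsFile is a hypothetical path to the already-written file.
     */
    public static void configureForLocalRun(Job job, Path partitionsFile) {
        Configuration conf = job.getConfiguration();
        TotalOrderPartitioner.setPartitionFile(conf, partitionsFile);
    }
}
```

This is a job-configuration fragment, not a definitive fix; on a real cluster the distributed cache path works as intended and this explicit setting should be redundant.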

On Thu, Aug 11, 2011 at 11:20, Ophir Cohen <ophchu@gmail.com> wrote:

> Now I see that it uses the distributed cache - but for some reason
> the TotalOrderPartitioner does not grab it.
> Ophir
>
>
> On Thu, Aug 11, 2011 at 11:08, Ophir Cohen <ophchu@gmail.com> wrote:
>
>> Hi,
>> I started to use bulk upload and encountered a strange problem.
>> I'm using Cloudera cdh3-u1.
>>
>> I'm using HFileOutputFormat.configureIncrementalLoad() to configure my
>> job.
>> This method creates a partition file for the TotalOrderPartitioner and
>> saves it to HDFS.
>>
>> When the TotalOrderPartitioner is initialized, it tries to find the path to
>> the file in the configuration:
>> public static String getPartitionFile(Configuration conf) {
>>   return conf.get(PARTITIONER_PATH, DEFAULT_PATH);
>> }
>>
>> The strange thing is that this parameter is never assigned!
>> It looks to me like it should be set
>> in HFileOutputFormat.configureIncrementalLoad(), but it is not!
>>
>> It then falls back to the default ("_part" or something similar) and, of
>> course, does not find the file...
>>
>> BTW
>> When I set this parameter manually, it works great.
>>
>> Is that a bug, or am I missing something?
>> Thanks,
>> Ophir
>>
>>
>
