hbase-user mailing list archives

From Amit Sela <am...@infolinks.com>
Subject Re: Bulk load moving HFiles to the wrong region
Date Mon, 16 Dec 2013 14:22:57 GMT
The logs on the RegionServer that the files are moved to do indeed
show that all files are moved to that region (when the problem doesn't
occur, they show only 1 file per family moved to a RegionServer).


On Mon, Dec 16, 2013 at 4:21 PM, Amit Sela <amits@infolinks.com> wrote:

> In the first step, the files are read correctly and regionGroups is
> created as it should be.
> When debugging, in LoadIncrementalHFiles.tryAtomicRegionLoad() I notice
> that ServerCallable's regionName returned from server is the wrong region
> (the pre-split last region).
> The previous last region is not supposed to be deleted; I'm just adding new
> regions (always following lexicographically) so that the last region before
> the pre-split is not the last anymore.
> It seems that wherever the ServerCallable is running, it is not updated
> with the new regions... I tried major compacting (the new regions) after
> pre-split and before the bulkload, but that didn't help.
>
>
>
> On Mon, Dec 16, 2013 at 3:07 PM, Bijieshan <bijieshan@huawei.com> wrote:
>
>> As we know, bulk load has two steps:
>> 1. Create HFiles by MapReduce.
>> 2. Load HFiles into HBase.
>>
>> I wonder whether it read the right partition information during the
>> first step. Have you run the hbck tool to check the cluster's health?
>> You mentioned you see the new regions in the webapp. That the files were
>> moved to the previous old region indicates the old region directory was
>> still there. So did you start the bulk load just after the region split?
>> (The old region directory is deleted by CatalogJanitor soon after a region
>> split, once compaction has finished.)
>>
>> I suggest checking the regionserver logs.
>>
>> Jieshan.
>> -----Original Message-----
>> From: Amit Sela [mailto:amits@infolinks.com]
>> Sent: Monday, December 16, 2013 2:29 PM
>> To: user@hbase.apache.org
>> Subject: RE: Bulk load moving HFiles to the wrong region
>>
>> Each split I execute is for a new day. The row key design is yyyyMMdd_URL,
>> and the split points are yyyyMMdd_x, yyyyMMdd_y, etc., chosen so that the
>> entire load is (almost) evenly spread.
>> The problem I described causes the bulk load to load all files to the
>> last region of the previous day.
>> Thanks.
>> On Dec 16, 2013 3:43 AM, "Bijieshan" <bijieshan@huawei.com> wrote:
>>
>> > Hi Amit:
>> > Can you provide the split keys of the new regions and your row-key
>> > design?
>> >
>> > Thank you.
>> > Jieshan.
>> > -----Original Message-----
>> > From: Amit Sela [mailto:amits@infolinks.com]
>> > Sent: Monday, December 16, 2013 7:09 AM
>> > To: user@hbase.apache.org
>> > Subject: Bulk load moving HFiles to the wrong region
>> >
>> > Hi all,
>> > I'm using Hadoop 1.0.4 and HBase 0.94.12.
>> > When trying to bulk load using the Java API I sometimes get the HFiles
>> > moved to the wrong directory.
>> > I'm pre-splitting regions and the new regions are always the last
>> > (lexicographically), so when this happens all files move to the region
>> > that was last before the pre-split. But the split does work: I see the
>> > new regions in the webapp before the bulk load executes. The problem
>> > doesn't always occur, but once a table has it, it persists until I
>> > restart HBase.
>> >
>> > Has anyone seen something similar?
>> >
>> > Thanks,
>> > Amit.
>> >
>>
>
>
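
To illustrate the symptom discussed in the quoted messages, here is a minimal
sketch in plain Java. It is not HBase code and does not reproduce
LoadIncrementalHFiles; it only assumes that a row key belongs to the region
with the greatest start key that is less than or equal to it. The class
RegionLookupSketch, its methods, the region names and the sample keys are all
hypothetical. If the lookup uses a region list taken before the pre-split,
every row key of the new day resolves to the region that used to be last,
which would match what I see.

```java
import java.util.TreeMap;

public class RegionLookupSketch {

    // Build a map from region start key to a made-up region name.
    // The empty string stands for the start key of the first region.
    static TreeMap<String, String> regions(String... startKeys) {
        TreeMap<String, String> byStartKey = new TreeMap<>();
        for (String startKey : startKeys) {
            byStartKey.put(startKey, "region[" + startKey + "]");
        }
        return byStartKey;
    }

    // A row key maps to the region with the greatest start key <= row key.
    // String ordering is used here; for ASCII keys such as yyyyMMdd_URL it
    // agrees with byte-wise lexicographic ordering.
    static String locate(TreeMap<String, String> byStartKey, String rowKey) {
        return byStartKey.floorEntry(rowKey).getValue();
    }

    public static void main(String[] args) {
        // Region list as it looked before the new day's pre-split (stale view).
        TreeMap<String, String> stale = regions("", "20131215_m");
        // Region list after adding the new day's regions (fresh view).
        TreeMap<String, String> fresh =
                regions("", "20131215_m", "20131216_a", "20131216_m");

        String[] rowKeys = {"20131216_example.com", "20131216_zeta.org"};
        for (String rowKey : rowKeys) {
            System.out.println(rowKey
                    + " stale=" + locate(stale, rowKey)
                    + " fresh=" + locate(fresh, rowKey));
        }
    }
}
```

With the stale view both sample keys land in the region starting at
20131215_m, while the fresh view spreads them over the two new regions.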
