hbase-user mailing list archives

From Amit Sela <am...@infolinks.com>
Subject Re: Bulk load moving HFiles to the wrong region
Date Mon, 16 Dec 2013 14:29:07 GMT
I ran the hbck tool, and while I do have some inconsistencies they are not
in the table that has the bulk load issues.



On Mon, Dec 16, 2013 at 4:22 PM, Amit Sela <amits@infolinks.com> wrote:

> The logs on the RegionServer that the files are moved to do indeed show
> that all files are moved to that region (when the problem doesn't occur,
> they show only 1 file per family moved to a RegionServer).
>
>
> On Mon, Dec 16, 2013 at 4:21 PM, Amit Sela <amits@infolinks.com> wrote:
>
>> In the first step, the files are read correctly and regionGroups is
>> created as it should be.
>> When debugging, in LoadIncrementalHFiles.tryAtomicRegionLoad() I notice
>> that ServerCallable's regionName returned from server is the wrong region
>> (the pre-split last region).
>> The previous last region is not supposed to be deleted; I'm just adding
>> new regions (always following lexicographically), so that the last region
>> before the pre-split is no longer the last.
>> It seems that wherever the ServerCallable is running, it is not updated
>> with the new regions... I tried major compacting (the new regions) after
>> pre-split and before the bulkload, but that didn't help.
>>
>>
>>
>> On Mon, Dec 16, 2013 at 3:07 PM, Bijieshan <bijieshan@huawei.com> wrote:
>>
>>> As we know, bulk load has two steps:
>>> 1. Create HFiles by MapReduce.
>>> 2. Load HFiles into HBase.
>>>
>>> I wonder whether it read the right partition information during the
>>> first step. Have you run the hbck tool to check the cluster's health?
>>> You mentioned that you see the new regions in the webapp. That the files
>>> were moved to the previous old region indicates that the old region
>>> directory was still there. So did you start the bulk load just after the
>>> region split? (The old region directory will be deleted by CatalogJanitor
>>> soon after the region split, once compaction has finished.)
>>>
>>> I suggest checking the regionserver logs.
>>>
>>> Jieshan.
>>> -----Original Message-----
>>> From: Amit Sela [mailto:amits@infolinks.com]
>>> Sent: Monday, December 16, 2013 2:29 PM
>>> To: user@hbase.apache.org
>>> Subject: RE: Bulk load moving HFiles to the wrong region
>>>
>>> Each split I execute is for a new day. The row key design is
>>> yyyyMMdd_URL, and the split points are yyyyMMdd_x, yyyyMMdd_y, etc.,
>>> chosen so that the entire load is (almost) evenly spread.
>>> The problem I described causes the bulk load to load all files to the
>>> last region of the previous day.
>>> Thanks.
>>> On Dec 16, 2013 3:43 AM, "Bijieshan" <bijieshan@huawei.com> wrote:
>>>
>>> > Hi Amit:
>>> > Can you provide the split keys of the new regions and your row-key
>>> > design?
>>> >
>>> > Thank you.
>>> > Jieshan.
>>> > -----Original Message-----
>>> > From: Amit Sela [mailto:amits@infolinks.com]
>>> > Sent: Monday, December 16, 2013 7:09 AM
>>> > To: user@hbase.apache.org
>>> > Subject: Bulk load moving HFiles to the wrong region
>>> >
>>> > Hi all,
>>> > I'm using Hadoop 1.0.4 and HBase 0.94.12.
>>> > When trying to bulk load using the Java API I sometimes get the HFiles
>>> > moved to the wrong directory.
>>> > I'm pre-splitting regions and the new regions are always the last
>>> > (lexicographically), so when this happens all files move to the region
>>> > that was last before the pre-split. But the split does work: I see the
>>> > new regions in the webapp before the bulk load executes. Once a table
>>> > has this problem (it doesn't happen all the time), it persists until I
>>> > restart HBase.
>>> >
>>> > Has anyone seen something similar?
>>> >
>>> > Thanks,
>>> > Amit.
>>> >
>>>
>>
>>
>
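
Illustration of the symptom described in the thread. This is a minimal
sketch in plain Python, not HBase code: it assumes that a row key is routed
to the region with the greatest start key that is less than or equal to the
row key, and that row keys are ASCII so string order matches byte order. The
split points, row keys, and the locate_region helper are hypothetical. It
only shows why a client working from a region list taken before the
pre-split would send every key of the new day to the old last region; the
thread does not confirm that this is the actual cause.

```python
from bisect import bisect_right


def locate_region(start_keys, row_key):
    """Return the index of the region whose start key is the greatest one
    that is <= row_key. start_keys must be sorted, and the first region's
    start key is the empty string (hypothetical helper, not an HBase API)."""
    return bisect_right(start_keys, row_key) - 1


# Region start keys as seen BEFORE the pre-split for the new day (assumed values).
stale_view = ["", "20131215_h", "20131215_p"]

# Region start keys AFTER the pre-split: new regions are appended at the end.
fresh_view = stale_view + ["20131216", "20131216_h", "20131216_p"]

# Row keys following the yyyyMMdd_URL design, all for the new day.
rows = ["20131216_a.com", "20131216_m.com", "20131216_z.com"]

stale_targets = [locate_region(stale_view, r) for r in rows]
fresh_targets = [locate_region(fresh_view, r) for r in rows]

print("stale view:", stale_targets)  # every key lands in the old last region
print("fresh view:", fresh_targets)  # keys spread over the new regions
```

With the stale list, all three keys resolve to index 2, the last region
before the pre-split; with the up-to-date list they resolve to three
different new regions.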
