hbase-user mailing list archives

From Amit Sela <am...@infolinks.com>
Subject Re: Bulk load moving HFiles to the wrong region
Date Mon, 16 Dec 2013 14:21:58 GMT
In the first step, the files are read correctly and regionGroups is created
as it should be.
When debugging, in LoadIncrementalHFiles.tryAtomicRegionLoad() I notice
that ServerCallable's regionName returned from server is the wrong region
(the pre-split last region).
The previous last region is not supposed to be deleted. I'm just adding new
regions (always following lexicographically), so that the last region before
the pre-split is not the last anymore.
It seems that wherever the ServerCallable is running, it is not updated
with the new regions. I tried major compacting the new regions after the
pre-split and before the bulk load, but that didn't help.
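
To make the symptom concrete, here is a minimal sketch of what I think is
happening. It is not HBase code: the class, the region names and the split
keys are all made up for illustration, and it only assumes that a row is
routed to the region with the greatest start key that is not larger than the
row key. With a view of the region boundaries taken before the pre-split,
every key for the new day falls into the old last region.

```java
import java.util.TreeMap;

// Illustration only: a toy model of region lookup by start key.
// None of these names come from HBase; they are hypothetical.
class StaleRegionCacheDemo {

    // Region boundaries as they were before the pre-split for the new day.
    static TreeMap<String, String> staleView() {
        TreeMap<String, String> regions = new TreeMap<>();
        regions.put("", "first_region");
        regions.put("20131215_", "day1_a");
        regions.put("20131215_m", "day1_b"); // last region before the pre-split
        return regions;
    }

    // Region boundaries after adding the new day's regions at the end.
    static TreeMap<String, String> freshView() {
        TreeMap<String, String> regions = staleView();
        regions.put("20131216_", "day2_a");
        regions.put("20131216_m", "day2_b");
        return regions;
    }

    // A row belongs to the region with the greatest start key <= rowKey.
    // The empty start key guarantees floorEntry never returns null here.
    static String locate(TreeMap<String, String> regionsByStartKey, String rowKey) {
        return regionsByStartKey.floorEntry(rowKey).getValue();
    }

    public static void main(String[] args) {
        String rowKey = "20131216_example.com/page"; // yyyyMMdd_URL
        System.out.println("fresh view: " + locate(freshView(), rowKey));
        System.out.println("stale view: " + locate(staleView(), rowKey));
    }
}
```

If something like this is going on, it would explain why all files end up in
the pre-split last region and why a restart clears it, but I have not
confirmed that this is the actual cause.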



On Mon, Dec 16, 2013 at 3:07 PM, Bijieshan <bijieshan@huawei.com> wrote:

> As we know, bulk load has two steps:
> 1. Create HFiles by MapReduce.
> 2. Load HFiles into HBase.
>
> I wonder whether it read the right partition information during the first
> step. Have you run the hbck tool to check the cluster's health?
> You mentioned you see the new regions in the webapp. That the files were
> moved to the old region indicates the old region directory was still
> there. So did you start the bulk load just after the region split? (The old
> region directory is deleted by CatalogJanitor soon after a region split,
> once compaction has finished.)
>
> I suggest checking the regionserver logs.
>
> Jieshan.
> -----Original Message-----
> From: Amit Sela [mailto:amits@infolinks.com]
> Sent: Monday, December 16, 2013 2:29 PM
> To: user@hbase.apache.org
> Subject: RE: Bulk load moving HFiles to the wrong region
>
> Each pre-split is executed for a new day. The row key design is
> yyyyMMdd_URL, and the split points are yyyyMMdd_x, yyyyMMdd_y etc., chosen
> so that the entire load is (almost) evenly spread.
> The problem I described causes the bulk load to load all files into the
> last region of the previous day.
> Thanks.
> On Dec 16, 2013 3:43 AM, "Bijieshan" <bijieshan@huawei.com> wrote:
>
> > Hi Amit:
> > Can you provide the split-keys of the new regions and your row-key
> > design?
> >
> > Thank you.
> > Jieshan.
> > -----Original Message-----
> > From: Amit Sela [mailto:amits@infolinks.com]
> > Sent: Monday, December 16, 2013 7:09 AM
> > To: user@hbase.apache.org
> > Subject: Bulk load moving HFiles to the wrong region
> >
> > Hi all,
> > I'm using Hadoop 1.0.4 and HBase 0.94.12.
> > When trying to bulk load using the Java API I sometimes get the HFiles
> > moved to the wrong directory.
> > I'm pre-splitting regions and the new regions are always the last
> > (lexicographically), so when this happens all files are moved to the
> > region that was last before the pre-split. But the split does work. I
> > see the new regions in the webapp before the bulk load executes. It
> > doesn't happen every time, but once a table has this problem it persists
> > until I restart HBase.
> >
> > Has anyone seen something similar?
> >
> > Thanks,
> > Amit.
> >
>
