hbase-user mailing list archives
From Jonathan Gray <jg...@facebook.com>
Subject RE: Problem with bulk incremental loads..
Date Sat, 11 Sep 2010 00:18:48 GMT
Not sure I follow.

When you do the reduce, you bucket the output according to the region boundaries that exist
on the cluster at that time.

If a split happens between that time and when the job finishes and you run the import,
then the files you created could be for a region which no longer exists.  The file you
created then has to be split along the current region boundaries.
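
To make the boundary check concrete, here is a minimal sketch in Python. It is not the LoadIncrementalHFiles code; the function names and the region start keys are hypothetical, and only the file's first/last keys are taken from the log quoted further down. It assumes regions are described by a sorted list of start keys, with the empty key marking the first region.

```python
from bisect import bisect_right


def find_region(start_keys, row):
    """Index of the region whose [start, next_start) range contains row."""
    return bisect_right(start_keys, row) - 1


def split_points(start_keys, first, last):
    """Region start keys s with first < s <= last.

    An empty result means the file fits inside a single region; otherwise
    the file has to be cut at each returned key before it can be loaded.
    """
    lo = bisect_right(start_keys, first)
    hi = bisect_right(start_keys, last)
    return start_keys[lo:hi]


# First/last keys of the HFile, as reported in the log.
first, last = "0000003511885973", "0000003511999994"

# Hypothetical region start keys when the MR job ran.
at_job_time = ["", "0000003500000000", "0000003600000000"]
# Hypothetical start keys after one region split before the import.
at_import_time = ["", "0000003500000000", "0000003511900000", "0000003600000000"]

fits_before = split_points(at_job_time, first, last)      # []
cuts_after = split_points(at_import_time, first, last)    # ['0000003511900000']
```

With the job-time boundaries the file lands in one region; after the hypothetical split it straddles two, so it needs one cut at the new start key.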

If no regions split and this still happened, then something completely different is
wrong.

You should open a JIRA for this, yes.

JG

> -----Original Message-----
> From: Vidhyashankar Venkataraman [mailto:vidhyash@yahoo-inc.com]
> Sent: Friday, September 10, 2010 3:20 PM
> To: user@hbase.apache.org
> Subject: Re: Problem with bulk incremental loads..
> 
> JG
>    What I ran into was not an issue with region splits: the created
> HFile does not fit in the range of any existing region, which means a
> new region needs to be created. In this case, the tool gets stuck
> repeatedly splitting this HFile...
>    Shall I report a bug and follow up on it?
> Vidhya
> 
> On 9/10/10 1:42 PM, "Jonathan Gray" <jgray@facebook.com> wrote:
> 
> I ran into something like this as well but was in a rush to get the
> import done, so I didn't look into it.  I then forgot about it and never
> followed up.
> 
> We ended up ensuring regions would not be split during the job
> (configuring the split size way up) and reran the MR job.
> 
> JG
> 
> > -----Original Message-----
> > From: Vidhyashankar Venkataraman [mailto:vidhyash@yahoo-inc.com]
> > Sent: Friday, September 10, 2010 11:43 AM
> > To: user@hbase.apache.org; hbase-user@hadoop.apache.org
> > Subject: Problem with bulk incremental loads..
> >
> > I was trying to bulk-load some files incrementally into an HBase (0.89)
> > table and found this problem.
> >
> > If a file does not fit into any of the regions in the existing table,
> > then the tool gets into an infinite loop of splitting the files. I
> > have attached a sample output. Todd, is this a known issue?
> >
> > Vidhya
> >
> > 10/09/07 01:57:29 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/8375781572558986795 first=0000003511885973 last=0000003511999994
> > 10/09/07 01:57:29 INFO mapreduce.LoadIncrementalHFiles: HFile at hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/8375781572558986795 no longer fits inside a single region. Splitting...
> > 10/09/07 01:57:37 INFO mapreduce.LoadIncrementalHFiles: Successfully split into new HFiles hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/fae6ef95297635e32e24c572bec9056e.bottom and hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/fae6ef95297635e32e24c572bec9056e.top
> > 10/09/07 01:57:37 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/fae6ef95297635e32e24c572bec9056e.top first=0000003511885973 last=0000003511999994
> > 10/09/07 01:57:37 INFO mapreduce.LoadIncrementalHFiles: HFile at hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/fae6ef95297635e32e24c572bec9056e.top no longer fits inside a single region. Splitting...
> > 10/09/07 01:57:44 INFO mapreduce.LoadIncrementalHFiles: Successfully split into new HFiles hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/_tmp/fae6ef95297635e32e24c572bec9056e.bottom and hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/_tmp/fae6ef95297635e32e24c572bec9056e.top
> > 10/09/07 01:57:44 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/_tmp/fae6ef95297635e32e24c572bec9056e.top first=0000003511885973 last=0000003511999994
> > 10/09/07 01:57:44 INFO mapreduce.LoadIncrementalHFiles: HFile at hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/_tmp/fae6ef95297635e32e24c572bec9056e.top no longer fits inside a single region. Splitting...
> > 10/09/07 01:57:51 INFO mapreduce.LoadIncrementalHFiles: Successfully split into new HFiles hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/_tmp/_tmp/fae6ef95297635e32e24c572bec9056e.bottom and hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/_tmp/_tmp/fae6ef95297635e32e24c572bec9056e.top
> > 10/09/07 01:57:51 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/_tmp/_tmp/fae6ef95297635e32e24c572bec9056e.top first=0000003511885973 last=0000003511999994
> > 10/09/07 01:57:51 INFO mapreduce.LoadIncrementalHFiles: HFile at hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/_tmp/_tmp/fae6ef95297635e32e24c572bec9056e.top no longer fits inside a single region. Splitting...
> > 10/09/07 01:57:59 INFO mapreduce.LoadIncrementalHFiles: Successfully split into new HFiles hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/_tmp/_tmp/_tmp/fae6ef95297635e32e24c572bec9056e.bottom and hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/_tmp/_tmp/_tmp/fae6ef95297635e32e24c572bec9056e.top
> > 10/09/07 01:57:59 INFO mapreduce.LoadIncrementalHFiles: Trying to load hfile=hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/_tmp/_tmp/_tmp/fae6ef95297635e32e24c572bec9056e.top first=0000003511885973 last=0000003511999994
> > 10/09/07 01:57:59 INFO mapreduce.LoadIncrementalHFiles: HFile at hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/_tmp/_tmp/_tmp/fae6ef95297635e32e24c572bec9056e.top no longer fits inside a single region. Splitting...
> > 10/09/07 01:58:06 INFO mapreduce.LoadIncrementalHFiles: Successfully split into new HFiles hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/_tmp/_tmp/_tmp/_tmp/fae6ef95297635e32e24c572bec9056e.bottom and hdfs://b3130080.yst.yahoo.net:4600/user/crawler/docd_inc_v1/bigColumn/_tmp/_tmp/_tmp/_tmp/_tmp/fae6ef95297635e32e24c572bec9056e.top
> 
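
In the log above, every ".top" file reports the same first and last keys as the file it was split from, so each split leaves the key range unchanged. One possible reading (an assumption, not a confirmed diagnosis of the bug) is that the chosen split key lies outside the file's key range, so one half is empty and the other half is the whole file again. The Python sketch below only illustrates that condition; the function names, the middle row, and the split keys are hypothetical, and this is not the HBase implementation.

```python
def split_at(rows, split_key):
    """Split sorted row keys into (bottom, top) halves around split_key."""
    bottom = [r for r in rows if r < split_key]
    top = [r for r in rows if r >= split_key]
    return bottom, top


def is_degenerate_split(first, last, split_key):
    """True if splitting at split_key cannot shrink the [first, last] range.

    If split_key <= first, the bottom half is empty and the top half is the
    whole file; if split_key > last, the top half is empty instead.
    """
    return split_key <= first or split_key > last


# First and last keys come from the log; the middle row is made up.
rows = ["0000003511885973", "0000003511900000", "0000003511999994"]

# Hypothetical split key below the file's first key: no progress is made.
bottom, top = split_at(rows, "0000003500000000")
stuck = is_degenerate_split(rows[0], rows[-1], "0000003500000000")

# Hypothetical split key inside the range: both halves are non-empty.
bottom_ok, top_ok = split_at(rows, "0000003511900000")
progress = not is_degenerate_split(rows[0], rows[-1], "0000003511900000")
```

A loader that checked for the degenerate case before retrying could fail with an error instead of splitting the same range forever.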

