hbase-user mailing list archives

From Bruce Bian <weidong....@gmail.com>
Subject Re: HBase bulk loaded region can't be splitted
Date Fri, 11 May 2012 03:07:12 GMT
Yes, I understand that.
But after I complete the bulk load, shouldn't that trigger the region server
to split the region so that it meets the hbase.hregion.max.filesize limit?
When I try to split the region manually from the web UI, nothing happens.
Instead, the region server log shows this message:

Region mytable,,1334215360439.71611409ea972a65b0876f953ad6377e. not
splittable because midkey=null


On Fri, May 11, 2012 at 10:56 AM, Bryan Beaudreault <
bbeaudreault@hubspot.com> wrote:

> I haven't done bulk loads using the importtsv tool, but I imagine it works
> similarly to the mapreduce bulk load tool we are provided.  If so, the
> following stands.
>
> In order to do a bulk load you need to have a table ready to accept the
> data.  The bulk load does not create regions, but only puts data into the
> right place based on existing regions.  Since you only have 1 region to
> start with, it makes sense that they would all go to that one region.  You
> should find a way to calculate the regions that you want and create your
> table with pre-created regions.  Then re-run the import.
>
> On Thu, May 10, 2012 at 10:50 PM, Bruce Bian <weidong.ban@gmail.com>
> wrote:
>
> > I use importtsv to load data as HFile
> >
> > hadoop jar hbase-0.92.1.jar importtsv
> > -Dimporttsv.bulk.output=/outputs/mytable.bulk
> > -Dimporttsv.columns=HBASE_ROW_KEY,ns: -Dimporttsv.separator=, mytable
> > /input
> >
> > Then I use completebulkload to load those bulk data into my table
> >
> > hadoop jar hbase-0.92.1.jar completebulkload /outputs/mytable.bulk mytable
> >
> > However, the table is very large (4.x GB) and has only one
> > region. Why doesn't HBase split it into multiple regions? It
> > exceeds the split threshold (256MB).
> >
> > /hbase/mytable/71611409ea972a65b0876f953ad6377e/ns:
> >
> > To split it, I tried the Split button on the HBase Web UI. Sadly, it
> > shows
> >
> > org.apache.hadoop.hbase.regionserver.CompactSplitThread: Region
> > mytable,,1334215360439.71611409ea972a65b0876f953ad6377e. not
> > splittable because midkey=null
> >
> > I have more data to load, about 300GB. No matter how much data I
> > load, there is still only one region, and it is still not splittable.
> > Any idea?
> >
>
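
Bryan's suggestion above (calculate the regions you want and create the table
with pre-created regions before re-running the import) could be sketched
roughly as follows. This is only an illustration under assumptions that are
not in the thread: the helper names pick_split_keys and shell_create_statement
are hypothetical, the row keys are assumed to be plain strings without quote
characters, and a sorted sample of the input keys is assumed to be
representative. The generated line uses the HBase shell's SPLITS option for
create; whether that option is available in 0.92.1 should be checked against
the shell's own help output.

```python
def pick_split_keys(sample_keys, num_regions):
    """Pick up to num_regions - 1 split keys at even quantiles of a key sample.

    Hypothetical helper: sample_keys is any iterable of row-key strings.
    """
    if num_regions < 2:
        return []
    keys = sorted(set(sample_keys))
    splits = []
    for i in range(1, num_regions):
        idx = i * len(keys) // num_regions
        if idx == 0 or idx >= len(keys):
            # Splitting at the smallest key would leave an empty first region.
            continue
        key = keys[idx]
        if not splits or key != splits[-1]:
            splits.append(key)
    return splits


def shell_create_statement(table, family, splits):
    """Format an HBase shell 'create' line with pre-split regions.

    Assumes keys contain no single quotes; no escaping is attempted.
    """
    quoted = ", ".join("'%s'" % k for k in splits)
    return "create '%s', '%s', {SPLITS => [%s]}" % (table, family, quoted)


if __name__ == "__main__":
    # Made-up sample keys standing in for the first TSV column.
    sample = ["row%04d" % i for i in range(1000)]
    print(shell_create_statement("mytable", "ns", pick_split_keys(sample, 4)))
```

The printed line would be pasted into the HBase shell to create the empty,
pre-split table; importtsv and completebulkload would then be re-run against
it so that the generated HFiles line up with the existing region boundaries.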
