hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Bulkload into empty table with configureIncrementalLoad()
Date Thu, 19 Sep 2013 16:55:46 GMT
You need to create the table with pre-splits, see
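
As a rough illustration of what pre-splitting means here: a table created with N - 1 split keys starts out with N regions, and configureIncrementalLoad() then sets up one reducer per region. The sketch below only computes the split keys, using the plain JDK. The class name PreSplitKeys and the method uniformSplits are hypothetical, and the single-byte key scheme assumes row keys are spread evenly over the first byte (for example, hashed or salted keys). Real split points should follow the actual row key distribution.

```java
public class PreSplitKeys {

    /**
     * Returns numRegions - 1 single-byte split keys that divide the
     * first-byte range 0x00..0xFF into numRegions roughly equal slices.
     * ASSUMPTION: row keys are uniformly distributed over their first byte.
     */
    public static byte[][] uniformSplits(int numRegions) {
        if (numRegions < 2 || numRegions > 256) {
            throw new IllegalArgumentException("numRegions must be in [2, 256]");
        }
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            // Boundary between slice i-1 and slice i, stored as an unsigned byte value.
            splits[i - 1] = new byte[] { (byte) (i * 256 / numRegions) };
        }
        return splits;
    }

    public static void main(String[] args) {
        // Five regions, one per region server in the cluster described in the question.
        byte[][] splits = uniformSplits(5);
        for (byte[] key : splits) {
            System.out.printf("split key: 0x%02X%n", key[0] & 0xFF);
        }
        // With the HBase client on the classpath, the array would be passed at
        // table creation time, e.g. HBaseAdmin.createTable(tableDescriptor, splits)
        // in the 0.94-era client API. That call is left out so this sketch
        // compiles without HBase.
    }
}
```

The table has to be created this way before the MapReduce job is configured, since configureIncrementalLoad() reads the region boundaries of the existing table to decide how many reducers to start.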


On Thu, Sep 19, 2013 at 9:52 AM, Dolan Antenucci <antenucci.d@gmail.com> wrote:

> I have about 1 billion values I am trying to load into a new HBase table
> (with just one column and column family), but am running into some issues.
>  Currently I am trying to use MapReduce to import these by first converting
> them to HFiles and then using LoadIncrementalHFiles.doBulkLoad().  I also
> use HFileOutputFormat.configureIncrementalLoad() as part of my MR job.  My
> code is essentially the same as this example:
> https://github.com/Paschalis/HBase-Bulk-Load-Example/blob/master/src/cy/ac/ucy/paschalis/hbase/bulkimport/Driver.java
> The problem I'm running into is that only 1 reducer is created
> by configureIncrementalLoad(), and there is not enough space on this node
> to handle all this data.  configureIncrementalLoad() should start one
> reducer for every region the table has, so apparently the table only has 1
> region -- maybe because it is empty and brand new (my understanding of how
> regions work is not crystal clear)?  The cluster has 5 region servers, so
> I'd at least like that many reducers to handle this loading.
> On a side note, I also tried the command line tool, completebulkload, but
> am running into other issues with this (timeouts, possible heap issues) --
> probably due to only one server being assigned the task of inserting all
> the records (i.e. I look at the region servers' logs, and only one of the
> servers has log entries; the rest are idle).
> Any help is appreciated.
> -Dolan Antenucci
