hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: importing a large table
Date Fri, 30 Mar 2012 04:08:00 GMT
On Thu, Mar 29, 2012 at 7:57 PM, Rita <rmorgan466@gmail.com> wrote:
> Hello,
> I am importing a 40+ billion row table which I exported several months ago.
> The data size is close to 18TB on hdfs (3x replication).

Does the table from back then still exist?  Or do you remember what
the key spread was like?  Could you precreate the old table?
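
If the old split points are gone, a rough stand-in might be computed like this. It is only a sketch: it assumes the row keys start with a fixed-width hex prefix spread evenly over the keyspace, which may not match your keys at all, and `even_split_points` is a made-up helper name, not anything in HBase.

```python
def even_split_points(num_regions: int, prefix_hex_digits: int = 8) -> list[str]:
    """Return num_regions - 1 evenly spaced hex split keys.

    Assumption: row keys begin with a hex prefix of `prefix_hex_digits`
    characters that is uniformly distributed. Real key spreads often are not.
    """
    if num_regions < 2:
        return []  # a single region needs no split points
    keyspace = 16 ** prefix_hex_digits
    return [
        format(i * keyspace // num_regions, f"0{prefix_hex_digits}x")
        for i in range(1, num_regions)
    ]


if __name__ == "__main__":
    # Four regions over a two-digit hex prefix.
    print(even_split_points(4, 2))
```

The resulting keys would then be passed as split points when creating the table, so the import writes into many regions from the start rather than one.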

> My problem is when I try to import it with mapreduce it takes a few days --
> which is ok -- however when the job fails for whatever reason, I have to
> restart everything. Is it possible to import the table in chunks like,
> import 1/3, 2/3, and then finally 3/3  of the table?

Yeah.  Funny how the plug gets pulled on the rack when the three-day
job is 95% done.
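
One way to do the chunking is to split the exported files into batches and record which batches have finished, so a failure only costs you the current batch. A minimal sketch, assuming the export is a set of files you can list, and that re-running a half-applied batch is harmless. `run_in_chunks`, `import_chunk`, and the checkpoint file are hypothetical names for illustration, not part of HBase.

```python
import json
from pathlib import Path


def chunk(items, n):
    """Split items into n contiguous chunks of near-equal size."""
    k, r = divmod(len(items), n)
    out, start = [], 0
    for i in range(n):
        end = start + k + (1 if i < r else 0)
        out.append(items[start:end])
        start = end
    return out


def run_in_chunks(files, n, checkpoint, import_chunk):
    """Import files in n batches, skipping batches already recorded as done.

    `import_chunk` is a caller-supplied callable (hypothetical) that imports
    one batch of files and raises on failure.
    """
    state_path = Path(checkpoint)
    done = set(json.loads(state_path.read_text())) if state_path.exists() else set()
    # Sort so that batch boundaries are the same on every rerun.
    for idx, part in enumerate(chunk(sorted(files), n)):
        if idx in done:
            continue
        import_chunk(part)  # an exception here leaves the checkpoint untouched
        done.add(idx)
        state_path.write_text(json.dumps(sorted(done)))
```

The checkpoint only makes sense if the file list and `n` stay the same between runs. `import_chunk` would be whatever launches the import job for that batch of files.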

> Btw, the job creates close to 150k mapper jobs, that's a problem waiting to
> happen :-)

Are you running 0.92?  If not, you should upgrade and go for bigger regions.  10G?
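
For a rough feel of what 10G regions would mean here, a back-of-envelope sketch. It assumes the 18TB figure includes the 3x replication and that the data takes about the same space in HBase as in the export, neither of which is certain. `estimate_regions` is a made-up helper.

```python
TIB = 1024 ** 4
GIB = 1024 ** 3


def estimate_regions(total_bytes: int, replication: int, region_bytes: int) -> int:
    """Estimate region count from replicated on-disk size (rounded up)."""
    logical = total_bytes // replication  # size of one copy of the data
    return -(-logical // region_bytes)    # integer ceiling division


if __name__ == "__main__":
    # 18 TiB on disk at 3x replication, 10 GiB regions.
    print(estimate_regions(18 * TIB, 3, 10 * GIB))
```

Under those assumptions that is on the order of several hundred regions.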

