hbase-user mailing list archives

From Rita <rmorgan...@gmail.com>
Subject importing a large table
Date Fri, 30 Mar 2012 02:57:11 GMT

I am importing a 40+ billion-row table which I exported several months ago.
The data size is close to 18 TB on HDFS (3x replication).

My problem is that when I try to import it with MapReduce it takes a few days --
which is ok -- however, when the job fails for whatever reason, I have to
restart everything. Is it possible to import the table in chunks, e.g.
import 1/3, then 2/3, and then finally 3/3 of the table?
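
To make the idea concrete, here is a rough sketch of the kind of chunking I
have in mind. It assumes the export directory holds independent part files
(something like part-m-00000, part-m-00001, ...) and that each batch could be
moved into its own directory and imported with a separate run of
"hbase org.apache.hadoop.hbase.mapreduce.Import <tablename> <inputdir>", so a
failure would only cost re-running one batch. The function name and the paths
are made up for illustration; I have not verified this against my own export.

```python
from typing import List


def split_into_batches(part_files: List[str], num_batches: int) -> List[List[str]]:
    """Split export part files into contiguous, nearly equal batches.

    The files are sorted first so that re-running the script always
    produces the same batches, which matters when resuming after a failure.
    """
    if num_batches < 1:
        raise ValueError("num_batches must be at least 1")

    files = sorted(part_files)
    base, extra = divmod(len(files), num_batches)

    batches = []
    start = 0
    for i in range(num_batches):
        # The first `extra` batches each take one additional file.
        size = base + (1 if i < extra else 0)
        batches.append(files[start:start + size])
        start += size
    return batches


if __name__ == "__main__":
    # Hypothetical export location; substitute the real part-file listing.
    parts = [f"/export/mytable/part-m-{i:05d}" for i in range(7)]
    for number, batch in enumerate(split_into_batches(parts, 3), start=1):
        print(f"batch {number}: {len(batch)} files")
```

Each batch would then be imported on its own, and only a failed batch would
need to be repeated.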

Btw, the job creates close to 150k map tasks, which is a problem waiting to
happen :-)

--- Get your facts first, then you can distort them as you please.--
