hbase-user mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: importing a large table
Date Fri, 30 Mar 2012 04:02:31 GMT
How did you export the table? Using HBase's Export?
If so, in 0.94+ (HBASE-5440, which should not be hard to backport to 0.92) you can use Import to
create HFiles for later use with bulk import (LoadIncrementalHFiles).

Creating the HFiles takes far less time, and loading them into HBase is nearly instantaneous
(provided they were created recently enough that they do not need to be re-split because
some of the table's regions have split in the meantime).
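For reference, a rough sketch of that flow. The table name and HDFS paths are placeholders, and the exact property name for HFile output (`import.bulk.output`, added by HBASE-5440) should be checked against the HBase version in use:

```shell
# 1. Export the table to a sequence-file dump on HDFS
#    (table name and output path are examples only).
hbase org.apache.hadoop.hbase.mapreduce.Export mytable /export/mytable

# 2. Re-import, but write HFiles instead of issuing Puts against
#    the cluster, by setting the bulk-output directory:
hbase org.apache.hadoop.hbase.mapreduce.Import \
  -Dimport.bulk.output=/bulk/mytable mytable /export/mytable

# 3. Bulk-load the generated HFiles into the live table; this step
#    is near-instant as long as the region boundaries still match.
hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
  /bulk/mytable mytable
```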

-- Lars

 From: Rita <rmorgan466@gmail.com>
To: user@hbase.apache.org 
Sent: Thursday, March 29, 2012 7:57 PM
Subject: importing a large table

I am importing a 40+ billion row table which I exported several months ago.
The data size is close to 18TB on hdfs (3x replication).

My problem is that when I try to import it with mapreduce it takes a few days --
which is ok -- however when the job fails for whatever reason, I have to
restart everything. Is it possible to import the table in chunks, like
import 1/3, 2/3, and then finally 3/3 of the table?

Btw, the job creates close to 150k mapper tasks; that's a problem waiting to
happen :-)

-- Get your facts first, then you can distort them as you please.