hbase-user mailing list archives

From "Sriram Muthuswamy Chittathoor" <srir...@ivycomptech.com>
Subject RE: HBase bulk load
Date Thu, 14 Jan 2010 05:49:14 GMT
I am trying to use this technique to bulk load about 20 billion rows.  I
tried it on a smaller set of 20 million rows.  One thing I had to take
care of was writing custom partitioning logic so that a given range of keys
only goes to one particular reducer, since there was some mention of global
For example, Users (1 -- 1 million) ---> Reducer 1, and so on.
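
The range-to-reducer mapping described above can be sketched roughly as follows. This is only an illustration of the partitioning logic, not the code used in the job: in a Hadoop job the same decision would sit inside a custom Partitioner's getPartition method. The split points, the function name make_range_partitioner, and the 0-based reducer numbering (the example above counts from 1) are all assumptions made for the sketch.

```python
from bisect import bisect_left


def make_range_partitioner(split_points):
    """Return a function that maps a numeric key to a reducer index.

    split_points holds the inclusive upper bound of every key range
    except the last one; keys above the final bound go to the last reducer.
    """
    bounds = sorted(split_points)

    def partition(key):
        # Index of the first bound >= key, i.e. the range the key falls in.
        return bisect_left(bounds, key)

    return partition


# Hypothetical layout: users 1..1,000,000 -> reducer 0,
# 1,000,001..2,000,000 -> reducer 1, everything above -> reducer 2.
partition = make_range_partitioner([1_000_000, 2_000_000])

for user_id in (1, 1_000_000, 1_000_001, 2_500_000):
    print(user_id, "-> reducer", partition(user_id))
```

Because the ranges are contiguous and ascending, every key seen by reducer i sorts before every key seen by reducer i+1. Note that HBase compares row keys as bytes, so numeric ids would need a fixed-width encoding for this numeric order to match the row key order.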

My questions are:
1.  Can I divide the bulk loading into multiple runs?  The existing
bulk load bails out if it finds an HDFS output directory with the same
2.  What I want to do is make multiple runs of 10 billion rows each and then
combine the output before running loadtable.rb.  Is this possible?
I am thinking this may be required in case my MR bulk load fails
midway and I need to restart from where it crashed.
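
To make the multi-run idea concrete, here is a minimal sketch of the bookkeeping only: splitting the row range into runs, giving each run its own output directory, and working out which runs still need to be executed after a failure. It does not answer whether the outputs of separate runs can be combined before loadtable.rb. The helper names plan_runs and pending_runs and the /bulkload/users path are hypothetical.

```python
def plan_runs(total_rows, rows_per_run, base_dir):
    """Split rows 1..total_rows into consecutive runs, one output dir each."""
    runs = []
    start = 1
    index = 0
    while start <= total_rows:
        end = min(start + rows_per_run - 1, total_rows)
        runs.append({
            "first_row": start,
            "last_row": end,
            # A distinct directory per run avoids colliding with earlier output.
            "output_dir": f"{base_dir}/run-{index:03d}",
        })
        start = end + 1
        index += 1
    return runs


def pending_runs(runs, completed_dirs):
    """Return the runs whose output directory is not marked as completed."""
    return [run for run in runs if run["output_dir"] not in completed_dirs]


# Hypothetical plan: 20 billion rows in runs of 10 billion each.
runs = plan_runs(20_000_000_000, 10_000_000_000, "/bulkload/users")

# If the first run finished and the second crashed, only the second remains.
for run in pending_runs(runs, {"/bulkload/users/run-000"}):
    print(run["output_dir"], run["first_row"], run["last_row"])
```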

Any tips from people with experience of huge bulk loads?

-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
Sent: Thursday, January 14, 2010 6:19 AM
To: hbase-user@hadoop.apache.org
Subject: Re: HBase bulk load


On Wed, Jan 13, 2010 at 4:30 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> Jonathan:
> Since you implemented
> ,
> maybe you can point me to some documentation on how bulk load is used?
> I found bin/loadtable.rb and assume that can be used to import data
> into HBase.
> Thanks
