hbase-user mailing list archives

From rajgopalv <raja.f...@gmail.com>
Subject Re: Inserting Random Data into HBASE
Date Thu, 02 Dec 2010 12:29:06 GMT

@Mike:
I am using the client-side cache. I collect the Puts in an ArrayList and
write them together using HTable.put(List<Put> puts).

MR seems to be a good idea.
I'm relatively new to HBase and haven't worked on a real-world HBase cluster.
So to begin with, could you recommend a cluster size? (I'm thinking
of 5 nodes; should I have more? I'll be using EC2 machines with EBS for storage.
That's fine, right?) And a replication factor of 3 will be sufficient, right?

@Alex Baranau: What is a good bufferSize? I'm using the default.

@amit: Thanks, man. But MR seems to be the better option, right?

rajgopalv wrote:
> Hi,
> I have to test how long HBase takes to store 100 million records.
> So I wrote a simple Java program which:
> 1: generates a random key and random values for 10 columns per key.
> 2: makes a Put object out of these and stores it in an ArrayList.
> 3: calls table.put(listOfPuts) when the ArrayList's size reaches 5000.
> 4: repeats until 100 million records have been put.
> I run this as a single-threaded Java program.
> Am I doing it right? Is there any other way of importing large data for
> testing? [For now I'm not considering bulk data import/loadtable.rb etc.
> Apart from this, is there any other way?]
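
For reference, the batching loop described in the quoted steps can be sketched roughly as below. This is only an illustration, not the actual test program: to keep it runnable without an HBase dependency, the table is replaced by a hypothetical BatchSink interface (standing in for HTable.put(List<Put>)), each row is a plain byte[][] instead of a Put, and the 16-byte key and 32-byte value sizes are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the table handle; a real client would call
// HTable.put(List<Put>) here instead.
interface BatchSink {
    void putBatch(List<byte[][]> rows);
}

class RandomRowLoader {
    static final int COLUMNS = 10;

    // Builds one row: element 0 is the random key, elements 1..COLUMNS
    // are the random column values.
    static byte[][] randomRow(Random rng, int keyLen, int valueLen) {
        byte[][] row = new byte[COLUMNS + 1][];
        row[0] = new byte[keyLen];
        rng.nextBytes(row[0]);
        for (int c = 1; c <= COLUMNS; c++) {
            row[c] = new byte[valueLen];
            rng.nextBytes(row[c]);
        }
        return row;
    }

    // Generates totalRows rows and hands them to the sink in batches of
    // batchSize. Returns the number of batches sent.
    static long load(BatchSink sink, long totalRows, int batchSize, long seed) {
        Random rng = new Random(seed);
        List<byte[][]> batch = new ArrayList<>(batchSize);
        long batches = 0;
        for (long i = 0; i < totalRows; i++) {
            batch.add(randomRow(rng, 16, 32)); // assumed key and value sizes
            if (batch.size() == batchSize) {
                sink.putBatch(batch);
                batches++;
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) { // flush the final partial batch
            sink.putBatch(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        long[] rowsSeen = {0};
        long batches = load(rows -> rowsSeen[0] += rows.size(), 12_500, 5_000, 42L);
        System.out.println(batches + " batches, " + rowsSeen[0] + " rows");
    }
}
```

With 12,500 rows and a batch size of 5000, the sink receives two full batches and one partial batch of 2500. The final flush after the loop matters whenever the total is not a multiple of the batch size.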

View this message in context: http://old.nabble.com/Inserting-Random-Data-into-HBASE-tp30349594p30357933.html
Sent from the HBase User mailing list archive at Nabble.com.
