hbase-user mailing list archives

From Gaurav Vashishth <vashgau...@gmail.com>
Subject Re: HBase Insert Performance
Date Fri, 12 Feb 2010 12:25:30 GMT

Ryan, 

I have set up the cluster as you suggested. Now I have the master, namenode and
zookeeper on the same machine and 8 region servers running as data nodes.
With this configuration I was able to get an insertion speed of around
18K records/sec. I'm still using 4 GB RAM but will upgrade that too, and I
hope adding more region servers will increase the insertion speed.
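
As a rough back-of-the-envelope check on the numbers in this thread, here is a minimal sketch that estimates how many region servers the 50K records/sec target might take. It assumes throughput scales linearly with the number of region servers, which is an assumption and not something measured here; real clusters usually scale less cleanly. The function names and the default of 2 spares (taken from the "N-{1,2}" advice quoted below) are illustrative only.

```python
import math


def region_servers_needed(target_rate, measured_rate, measured_servers, spares=2):
    """Estimate region servers for target_rate, ASSUMING linear scaling.

    measured_rate / measured_servers gives the observed per-server rate;
    spares is extra headroom so the cluster still copes with failures.
    """
    per_server = measured_rate / measured_servers
    return math.ceil(target_rate / per_server) + spares


def ram_range_gb(cores, low_gb_per_core=1, high_gb_per_core=2):
    """RAM range in GB from the 1-2 GB per core rule of thumb in this thread."""
    return cores * low_gb_per_core, cores * high_gb_per_core


if __name__ == "__main__":
    # 18K records/sec observed on 8 region servers, target of 50K records/sec.
    print(region_servers_needed(50_000, 18_000, 8))
    # 8-core machines, as described in the original question.
    print(ram_range_gb(8))
```

Under these assumptions the estimate is 23 region servers plus 2 spares, and 8 to 16 GB of RAM per 8-core machine. Treat the result as a starting point for testing, not a sizing guarantee.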

Thanks,

Gaurav


Ryan Rawson wrote:
> 
> Hey,
> 
> So there are 2 major problems here:
> - the setup is way off. There is no actual data duplication, for
> example: you will put every write to 1 machine, and when it fails,
> so goes your data.
> - These machines don't have enough ram. They must have at least
> 1gb/core, ideally 2gb/core or more.  This means they should have 8 gb
> ram.  crucial.com
> 
> A better setup would be:
> - 1 "master" node, runs: hmaster, 1xzookeeper, namenode
> - 5 data/regionservers
> 
> The key to performance here is to spread your workload over more
> machines.  This is how clustered software works in a nutshell.  Using
> only 1/3 of your machines for "regionservers" and 1/6th for data
> storage (datanode) is non-ideal.
> 
> You really need to up the ram.  I run:
> - dual quad i7s with hyper-threading, which gives 16 cores to the OS
> - 24 gb ram
> - 4 x 1tb disk
> 
> My small end machines are:
> - dual quad xeons, 8 cores to the OS
> - 16 gb ram
> - 2 x 1tb disk
> 
> For performance you really don't want to have less than 1-2gb ram per
> core. Without a lot of ram, you don't get effective disk caching. You
> can't run map-reduces on the same nodes, you may run into swap issues,
> etc.  4 gb ddr3 ram is about $150 usd.
> 
> But given a reasonable machine set, doing 50k inserts/sec sustained
> over long periods of time is totally doable. You will need more than 6
> machines though! Don't forget your spares, since you really want to be
> able to operate on N-{1,2} machines so failures don't cripple you.
> 
> 
> 
> On Mon, Jan 18, 2010 at 2:55 AM, Gaurav Vashishth <vashgaurav@gmail.com>
> wrote:
>>
>> Right now I'm using 6 machines, each 8-core with 4 GB RAM, to set up the
>> scenario.
>>
>> 2 region servers
>> 1 ZooKeeper
>> 1 Data Node
>> 2 Name Node
>>
>>
>>
>> Ryan Rawson wrote:
>>>
>>> How many machines do you have? I'd try at least 20+ late model boxes.
>>>
>>> On Jan 18, 2010 2:14 AM, "Gaurav Vashishth" <vashgaurav@gmail.com>
>>> wrote:
>>>
>>>
>>> I need to store live data at about 40-50K records/sec. I evaluated MySQL
>>> and am now trying HBase.
>>>
>>> I just read on docstoc that HBase insert performance, for a few thousand
>>> rows and 10 columns with 1 MB values, is 68 ms/row. My scenario is similar:
>>> we need under 10k rows and 10-20 columns, which can have thousands of
>>> versions with values no greater than 300 bytes. Initially I thought HBase
>>> could serve the purpose, but reading the docstoc article has put doubt in
>>> my mind.
>>>
>>> Can we get 40-50k records/sec insertion speed in HBase? Also, there
>>> would be thousands of users reading the database as well; can HBase
>>> maintain that much speed?
>>>
>>> Thanks
>>> Gaurav
>>> --
>>> View this message in context:
>>> http://old.nabble.com/HBase-Insert-Performance-tp27208387p27208387.html
>>> Sent from the HBase User mailing list archive at Nabble.com.
>>>
>>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/HBase-Insert-Performance-tp27208387p27208828.html
>> Sent from the HBase User mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: http://old.nabble.com/HBase-Insert-Performance-tp27208387p27562803.html
Sent from the HBase User mailing list archive at Nabble.com.

