hbase-user mailing list archives

From vramanatha...@aol.com
Subject Re: Production usage stats/experience
Date Sat, 10 Jul 2010 21:42:53 GMT
J-D
Thanks a heap. This is very enlightening. Looking forward to the blog posts.
Venkatesh

-----Original Message-----
From: Jean-Daniel Cryans <jdcryans@apache.org>
To: user@hbase.apache.org
Sent: Sat, Jul 10, 2010 5:04 pm
Subject: Re: Production usage stats/experience


Funny you ask, because we are about to write a series of blog posts
about our HBase usage, operational experiences, future plans, etc., for
the new official HBase blog (hbaseblog.com).

Still, I shared some details inline (but obviously this will be covered in more detail later).

J-D

On Sat, Jul 10, 2010 at 11:04 AM, <vramanathan00@aol.com> wrote:
> Could you please share some #s? Like how many requests at peak, data store size,
> # of nodes in cluster, etc. (if you can reveal that, that is)?

Our production cluster at peak answers around 16k to 22k requests per
second (depending on the day). It's mostly atomic increments, which we
use for real-time reporting. Our MapReduce cluster peaks at around
7-8M scanned rows per second during our nightly big table
scan/recompute. We must have around 35B rows in all our tables
together, but I need to run some counts to have the right number (our
2 biggest tables have just over 15B each).

We have a total of 5 clusters, 20 machines each, configured with 2 i7s,
24GB of RAM and 4x1TB disks.
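
As a rough sanity check on the figures above, here is a minimal back-of-envelope sketch. It is not from the thread: it assumes the nightly scan reads every row exactly once at the quoted peak rate, and it assumes HDFS runs with its default replication factor of 3, neither of which the thread states. The function names are made up for illustration.

```python
# Back-of-envelope figures derived from the numbers quoted in the thread.
# Assumptions (NOT stated in the thread): the scan touches every row once
# at the peak rate, and HDFS uses its default replication factor of 3.

TOTAL_ROWS = 35e9             # "around 35B rows", an approximate figure
SCAN_RATES = (7e6, 8e6)       # "7-8M scanned rows per second" at peak

CLUSTERS = 5
MACHINES_PER_CLUSTER = 20
DISK_TB_PER_MACHINE = 4 * 1   # 4 x 1TB
RAM_GB_PER_MACHINE = 24
ASSUMED_REPLICATION = 3       # assumption: HDFS default, not confirmed


def full_scan_minutes(rows, rows_per_second):
    """Minutes needed to scan `rows` at a sustained `rows_per_second`."""
    return rows / rows_per_second / 60


def raw_disk_tb():
    """Raw disk capacity across all clusters, in TB."""
    return CLUSTERS * MACHINES_PER_CLUSTER * DISK_TB_PER_MACHINE


def usable_disk_tb(replication=ASSUMED_REPLICATION):
    """Capacity left after dividing raw disk by the replication factor."""
    return raw_disk_tb() / replication


def total_ram_gb():
    """Total RAM across all clusters, in GB."""
    return CLUSTERS * MACHINES_PER_CLUSTER * RAM_GB_PER_MACHINE


if __name__ == "__main__":
    for rate in SCAN_RATES:
        minutes = full_scan_minutes(TOTAL_ROWS, rate)
        print(f"{rate:.0e} rows/s -> {minutes:.1f} min for a full scan")
    print(f"raw disk: {raw_disk_tb()} TB, "
          f"usable at {ASSUMED_REPLICATION}x replication: {usable_disk_tb():.1f} TB")
    print(f"total RAM: {total_ram_gb()} GB")
```

Under those assumptions a full pass over 35B rows would take somewhere between roughly 73 and 83 minutes, and the 100 machines would hold 400 TB of raw disk. Real numbers depend on which tables the nightly job actually scans and on how the five clusters are split.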



>
> I'm also planning to use HBase for a realtime web app. I would like to get some input
> on what to do if something goes wrong...
> ..In development, if I see any issues, I do kill -9/stop all & rm -rf of the disk due to time crunch
> ..(bad idea)..Obviously I can't do that in production..

Oh, we've had worse issues than that. What about an unresponsive root disk
that freezes your OS but not some processes?

> -> Have you ever run into data corruption where you could not recover any data?

Nope, hurray for checksumming at the HDFS level.

> -> If there is an outage & you have to restart servers, in what order do you restart them? (I presume
> namenode/datanode, followed by HMaster, HRegionServer, followed by ZooKeeper,
> followed by the HBase client app)

Hadoop, then ZooKeeper, then HBase, then the Thrift servers (our client is in PHP).
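
The start order given above can be written down as a tiny sketch. The tier labels and function names below are illustrative only, and stopping in the reverse order is an assumption based on common practice; the thread only states the start order.

```python
# Start order as stated in the thread: Hadoop, ZooKeeper, HBase, Thrift.
# The labels are illustrative names, not real service or script names.
START_ORDER = ["hadoop", "zookeeper", "hbase", "thrift"]


def start_sequence():
    """Tiers in the order they should be brought up."""
    return list(START_ORDER)


def stop_sequence():
    """Tiers in the order to shut them down.

    Assumption (not from the thread): shut down in reverse start order,
    so nothing loses a dependency while it is still running.
    """
    return list(reversed(START_ORDER))


def run(sequence, action):
    """Apply `action(tier)` to each tier in order; stop at the first failure.

    `action` is a caller-supplied callable returning True on success.
    Returns the list of tiers that were handled successfully.
    """
    done = []
    for tier in sequence:
        if not action(tier):
            break
        done.append(tier)
    return done


if __name__ == "__main__":
    # Dry run: the lambda only prints the tier name and reports success.
    run(start_sequence(), lambda tier: print("start", tier) is None)
```

Stopping at the first failure matters here: there is little point starting HBase if ZooKeeper did not come up.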



> -> Is there anything that we must back up in the event of an outage? (or) let HDFS replication do its magic?
> ..I'm ok with losing a few days' data ..but not all.

We do incremental backups every hour to an NFS share and to another
cluster in another datacenter.
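
The thread does not say which tool performs these backups. Since HBase cells carry millisecond timestamps, one plausible way to make a backup incremental is to export only the cells written inside each completed hour. The sketch below only computes those hour windows; the function names are hypothetical, and handing the windows to an export job is left out.

```python
from datetime import datetime, timedelta, timezone

HOUR = timedelta(hours=1)


def to_millis(dt):
    """Convert an aware datetime to epoch milliseconds."""
    return int(dt.timestamp() * 1000)


def hourly_windows(last_backup_end, now):
    """Yield half-open (start_ms, end_ms) windows for each complete hour.

    `last_backup_end` is where the previous backup stopped (assumed to be
    hour-aligned). The current, still-running hour is skipped, so no
    window covers data that may still be changing.
    """
    limit = now.replace(minute=0, second=0, microsecond=0)
    start = last_backup_end
    while start + HOUR <= limit:
        yield to_millis(start), to_millis(start + HOUR)
        start += HOUR


if __name__ == "__main__":
    last = datetime(2010, 7, 10, 18, 0, tzinfo=timezone.utc)
    now = datetime(2010, 7, 10, 21, 42, 53, tzinfo=timezone.utc)
    for start_ms, end_ms in hourly_windows(last, now):
        print(start_ms, end_ms)
```

Using half-open windows means consecutive backups neither overlap nor leave gaps, and a missed run is caught up automatically on the next one.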



>
> thanks in advance
> venkatesh
>
> -----Original Message-----
> From: Jean-Daniel Cryans <jdcryans@apache.org>
> To: user@hbase.apache.org
> Sent: Sat, Jul 10, 2010 1:47 pm
> Subject: Re: real world usage, any web applications built using hbase?
>
> At StumbleUpon, we have su.pr (URL shortener / advertising platform)
> that's totally based on HBase and has been in production for more than
> a year. Many other parts of our main product also rely on HBase.
>
> J-D
>
> On Sat, Jul 10, 2010 at 10:43 AM, S Ahmed <sahmed1020@gmail.com> wrote:
>> It's my impression that most people are using NoSQL solutions for things like
>> statistics logging etc.
>>
>> Has anyone built a web application purely in HBase? E.g., say an application
>> like Blogger or Gmail or vBulletin type applications.
>>
>> Are these potential candidates for building on top of a NoSQL data store?

 
