hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Production usage stats/experience
Date Sat, 10 Jul 2010 21:04:25 GMT
Funny you ask, because we are about to write a series of blog posts
about our HBase usage, operational experiences, future plans, etc., for
the new official HBase blog (hbaseblog.com).

Still, I've shared some details inline (obviously these will be covered in more detail later).

J-D

On Sat, Jul 10, 2010 at 11:04 AM,  <vramanathan00@aol.com> wrote:
>  Could you please share some #s? like how many requests @ peak, data store size,
> # of nodes in cluster, etc (if u can reveal that is)?

Our production cluster at peak answers around 16k to 22k requests per
second (depends on the day). It's mostly atomic increments which we
use for real-time reporting. Our MapReduce cluster peaks at around
7-8M scanned rows per second during our nightly big table
scan/recompute. We must have around 35B rows in all our tables
together, but I need to run some counts to have the right number (our
2 biggest tables have just over 15B each).

We have a total of 5 clusters, 20 machines each, configured with 2 i7s,
24GB of RAM and 4x1TB disks.
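
As a rough illustration of what those hardware figures add up to, here is a minimal Python sketch. The cluster, machine, disk and RAM counts come from the message above; the replication factor of 3 is an assumption (the HDFS default), not something stated in the thread, and the function name is hypothetical.

```python
# Figures quoted in the message above.
CLUSTERS = 5
MACHINES_PER_CLUSTER = 20
DISKS_PER_MACHINE = 4
TB_PER_DISK = 1
RAM_GB_PER_MACHINE = 24

# ASSUMPTION: HDFS default replication factor; the thread does not state it.
HDFS_REPLICATION = 3


def fleet_totals():
    """Return aggregate machine, disk and RAM totals across all clusters."""
    machines = CLUSTERS * MACHINES_PER_CLUSTER
    raw_disk_tb = machines * DISKS_PER_MACHINE * TB_PER_DISK
    return {
        "machines": machines,
        "raw_disk_tb": raw_disk_tb,
        "ram_gb": machines * RAM_GB_PER_MACHINE,
        # Raw capacity divided by the assumed replication factor.
        "replicated_capacity_tb": raw_disk_tb / HDFS_REPLICATION,
    }


if __name__ == "__main__":
    for name, value in fleet_totals().items():
        print(f"{name}: {value}")
```

These are upper bounds on raw hardware only; they say nothing about how much of that space HBase actually occupies.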

>
> I'm also planning to use HBase for realtime web app. I would like to get some inputs
> on what to do if something goes wrong...
> ..In development, if i see any issues, I do kill -9/stop all & rm -rf disk. due to
> time crunch ..(bad idea)..Obviously i can't do that in production..

Oh, we've had worse issues than that. What about an unresponsive root disk
that freezes your OS but not some processes?

> -> Have you ever run into data corruption? ..that you could not recover any data?

Nope, hurray for checksumming at the HDFS level.

> -> If there is outage & if you have to restart servers, what order you restart
> servers? (I presume namenode/datanode, followed by HMaster, HRegionServer,
> followed by zookeeper, followed by HBase client app)

Hadoop, then ZooKeeper, then HBase, then the Thrift servers (our client is in PHP).
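
The restart order above can be sketched as a small Python helper that, given the set of services that are down, returns them in the order they should be brought back. The service names and the `start_plan` function are hypothetical labels for illustration; only the relative order comes from the message.

```python
# Start order as described in the message: each layer depends on the ones
# before it. The string labels are illustrative, not real service names.
START_ORDER = ["hadoop", "zookeeper", "hbase", "thrift"]


def start_plan(down):
    """Return the services in `down`, sorted into the start order."""
    unknown = set(down) - set(START_ORDER)
    if unknown:
        raise ValueError(f"unknown services: {sorted(unknown)}")
    return [service for service in START_ORDER if service in down]


if __name__ == "__main__":
    # Example: everything except ZooKeeper needs a restart.
    print(start_plan({"thrift", "hadoop", "hbase"}))
```

Actually starting each service (init scripts, health checks between steps) is deliberately left out, since the thread gives no details on how that is done.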

> -> Is there anything that we must backup in the advent of outage? (or) let HDFS
> replication do its magic?
> ..I'm ok with losing few days data ..but not all.

We do incremental backups every hour to an NFS share and to another
cluster in another datacenter.
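
The message does not say how the hourly incremental backups are implemented. One common ingredient of any hourly incremental scheme is computing the time window of the last full hour, so that each run picks up only the cells written in that window. A minimal Python sketch, assuming timestamps in epoch milliseconds (the unit HBase uses for cell timestamps) and a hypothetical function name:

```python
from datetime import datetime, timedelta, timezone


def previous_hour_window_ms(now):
    """Return (start_ms, end_ms) covering the last full hour before `now`.

    `now` must be a timezone-aware datetime. The window is half-open:
    start_ms is included, end_ms is excluded.
    """
    end = now.replace(minute=0, second=0, microsecond=0)
    start = end - timedelta(hours=1)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


if __name__ == "__main__":
    start_ms, end_ms = previous_hour_window_ms(datetime.now(timezone.utc))
    print(start_ms, end_ms)
```

The resulting pair could then be handed to whatever export mechanism is in use as its time range; which tool that is here is an open question the thread does not answer.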

>
> thanks in advance
> venkatesh
>
> -----Original Message-----
> From: Jean-Daniel Cryans <jdcryans@apache.org>
> To: user@hbase.apache.org
> Sent: Sat, Jul 10, 2010 1:47 pm
> Subject: Re: real world usage, any web applications built using hbase?
>
> At StumbleUpon, we have su.pr (URL shortener / advertising platform)
> that's totally based on HBase and has been in production for more than
> a year. Many other parts of our main product also rely on HBase.
>
> J-D
>
> On Sat, Jul 10, 2010 at 10:43 AM, S Ahmed <sahmed1020@gmail.com> wrote:
>> It's my impression that most people are using nosql solutions for things like
>> statistic logging etc.
>>
>> Has anyone built a web application purely in hbase? e.g. Say an application
>> like Blogger or Gmail or vBulletin type applications.
>>
>> Are these potential candidates for building on top of a nosql data store?
>>
