hbase-user mailing list archives

From iain wright <iainw...@gmail.com>
Subject Re: What companies are using HBase to serve a customer-facing product?
Date Fri, 05 Dec 2014 22:15:36 GMT
Hi Jeremy,

Pinterest is using it for their feeds:
http://www.slideshare.net/cloudera/case-studies-session-3a
http://www.slideshare.net/cloudera/operations-session-1

I'm not sure of their dataset size, but they are doing cluster-level
replication for DR. We based our architecture on their success: a cluster in
each AZ, multi-master replication between them for DR, and Flume and our APIs
watching ZooKeeper znodes to learn which cluster to talk to. We talk to one
cluster at a time and control flips between them for maintenance/DR. Our use
case is retrieving social data ingested from Twitter/FB/etc. when
customer-facing applications hit our social API.
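
To make the "watch a znode to pick the cluster" idea concrete, here is a minimal sketch of the client side. It is not our production code: the class name, the cluster names, and the quorum hostnames are all made up for illustration, and the callback is written so that a ZooKeeper client's data watch (for example kazoo's DataWatch, which calls back with the znode data and stat) could drive it. The sketch assumes the znode payload is just the name of the active cluster.

```python
import threading


class ActiveClusterSelector:
    """Tracks which HBase cluster clients should currently talk to.

    Hypothetical helper: the znode is assumed to hold the name of the
    active cluster as UTF-8 text (e.g. b"az-b").
    """

    def __init__(self, clusters, default):
        if default not in clusters:
            raise ValueError("default cluster must be one of the known clusters")
        self._clusters = dict(clusters)  # cluster name -> ZooKeeper quorum string
        self._active = default
        self._lock = threading.Lock()

    def on_znode_change(self, data, stat=None):
        """Callback to invoke with the znode payload whenever it changes."""
        if data is None:
            # Znode missing or empty: keep talking to the current cluster.
            return
        name = data.decode("utf-8").strip()
        with self._lock:
            # Ignore names we do not know rather than flipping to nowhere.
            if name in self._clusters:
                self._active = name

    def active_name(self):
        with self._lock:
            return self._active

    def active_quorum(self):
        """Quorum string a client would use to connect to the active cluster."""
        with self._lock:
            return self._clusters[self._active]


# Hypothetical cluster names and hosts, one cluster per AZ.
selector = ActiveClusterSelector(
    {"az-a": "zk-a1:2181,zk-a2:2181", "az-b": "zk-b1:2181,zk-b2:2181"},
    default="az-a",
)
selector.on_znode_change(b"az-b")  # operator flips traffic for maintenance/DR
```

The point of the design is that flips are an operator decision written to one znode, and every reader/writer follows it, so only one cluster takes traffic at a time while replication keeps the other one current.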

In terms of team size, there are many variables:
- If you are running your own metal, there will be more work around
networking/rack+stack+cabling/provisioning OS/etc., unless this is provided
by another dept already.
- Do you have an HBase expert or DBA in house already? Or are your
developers going to take on learning schema design and tuning the cluster?
- Do you have sysadmins/devops available to write Puppet/Chef/Ansible for
provisioning this cluster (and dev/QA environments) and performing
upgrades/etc. moving forward?
- Do you have a NOC & monitoring already in place for other pieces of infra
that will take on monitoring cluster health and responding to alerts/failed
disks/regionservers/etc.?

You may want to check out previous HBaseCon and Hadoop Summit videos. Lots
of presentations talk about, or at least mention, their dataset size and
use case:
- https://www.youtube.com/user/HadoopSummit
- http://hbasecon.com/archive.html

All the best,

-- 
Iain Wright

On Fri, Dec 5, 2014 at 1:37 PM, jeremy p <athomewithagroovebox@gmail.com>
wrote:

> Hey all,
>
> So, I'm currently evaluating HBase as a solution for querying a very large
> data set (think 60+ TB). We'd like to use it to directly power a
> customer-facing product. My question is fourfold:
>
> 1) What companies use HBase to serve a customer-facing product? I'm not
> interested in evaluations, experiments, or POC.  I'm also not interested in
> offline BI or analytics.  I'm specifically interested in cases where HBase
> serves as the data store for a customer-facing product.
>
> 2) Of the companies that use HBase to serve a customer-facing product,
> which ones use it to query data sets of 60TB or more?
>
> 3) Of companies that use HBase to query 60+ TB data sets and serve a
> customer-facing product, how many employees are required to support their
> HBase installation?  In other words, if I were to start a team tomorrow,
> and their purpose was to maintain a 60+ TB HBase installation for a
> customer-facing product, how many people should I hire?
>
> 4) Of companies that use HBase to query 60+ TB data sets and serve a
> customer-facing product, what kind of measures do they take for disaster
> recovery?
>
> If you can, please point me to articles, videos, and other materials.
> Obviously, the larger the company, the better case it will make for HBase.
>
> Thank you!
>
