hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/PoweredBy" by KenWeiner
Date Tue, 15 Dec 2009 19:29:32 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hbase/PoweredBy" page has been changed by KenWeiner.
The comment on this change is: Changed GumGum to BEDROCK (recently spun out of GumGum).


  [[http://www.adobe.com|Adobe]] - We currently have about 30 nodes running HDFS, Hadoop and
HBase in clusters ranging from 5 to 14 nodes on both production and development. We plan
a deployment on an 80-node cluster. We are using HBase in several areas from social services
to structured data and processing for internal use. We constantly write data to HBase and
run MapReduce jobs to process it, then store the results back to HBase or external systems. Our
production cluster has been running since Oct 2008.
+ [[http://www.bedrock.com|BEDROCK]] is a monetization platform for building the next generation
of ad products. We use HBase 0.20.0 on a 4-node Amazon EC2 Large Instance (m1.large) cluster
for both real-time data and analytics. Our production cluster has been running since July
  [[http://www.flurry.com|Flurry]] provides mobile application analytics. We use HBase and
Hadoop for all of our analytics processing, and serve all of our live requests directly out
of HBase on our 16-node production cluster, which holds billions of rows across several tables.
  [[http://www.drawntoscaleconsulting.com|Drawn to Scale Consulting]] consults on HBase, Hadoop,
Distributed Search, and Scalable architectures.
- [[http://gumgum.com|GumGum]] is an analytics and monetization platform for online content.
We've developed usage-based licensing models that make the best content in the world accessible
to publishers of all sizes.  We use HBase 0.20.0 on a 4-node Amazon EC2 cluster to record
visits to advertisers in our ad network. Our production cluster has been running since July
  [[http://www.mahalo.com|Mahalo]], "...the world's first human-powered search engine". All
the markup that powers the wiki is stored in HBase. It's been in use for a few months now.
!MediaWiki - the same software that powers Wikipedia - has version/revision control. Mahalo's
in-house editors produce a lot of revisions per day, which was not working well in an RDBMS.
An HBase-based solution for this was built and tested, and the data migrated out of MySQL
and into HBase. Right now it's at something like 6 million items in HBase. The upload tool
runs every hour from a shell script to back up that data, and on 6 nodes takes about 5-10
minutes to run - and does not slow down production at all.
