hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/PoweredBy" by StevenNoels
Date Mon, 21 Jun 2010 20:59:48 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hbase/PoweredBy" page has been changed by StevenNoels.
The comment on this change is: added Lily.


  [[http://gumgum.com|GumGum]] is an in-image ad network. We use HBase 0.20 on a 4-node Amazon
EC2 Large Instance (m1.large) cluster for both real-time data and analytics. Our production
cluster has been running since June 2010.
  [[http://www.kalooga.com|Kalooga]] is a discovery service for image galleries. We use Hadoop,
HBase, Chukwa and Pig on a 20-node cluster for our crawling, analysis and event processing.
+ [[http://www.lilycms.org|Lily]] is an open source content repository backed by HBase and
Solr from Outerthought - scalable content applications.
  [[http://www.mahalo.com|Mahalo]], "...the world's first human-powered search engine". All
the markup that powers the wiki is stored in HBase. It's been in use for a few months now.
!MediaWiki - the same software that powers Wikipedia - has version/revision control. Mahalo's
in-house editors produce a lot of revisions per day, which was not working well in an RDBMS.
An HBase-based solution for this was built and tested, and the data migrated out of MySQL
and into HBase. Right now it's at something like 6 million items in HBase. The upload tool
runs every hour from a shell script to back up that data, and on 6 nodes takes about 5-10
minutes to run - and does not slow down production at all.
