hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/PoweredBy" by danharvey
Date Tue, 04 Jan 2011 11:39:22 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hbase/PoweredBy" page has been changed by danharvey.


  [[http://www.mahalo.com|Mahalo]], "...the world's first human-powered search engine". All
the markup that powers the wiki is stored in HBase. It's been in use for a few months now.
!MediaWiki - the same software that powers Wikipedia - has version/revision control. Mahalo's
in-house editors produce a lot of revisions per day, which was not working well in an RDBMS.
An HBase-based solution for this was built and tested, and the data migrated out of MySQL
and into HBase. Right now it's at something like 6 million items in HBase. The upload tool
runs every hour from a shell script to back up that data, and on 6 nodes takes about 5-10
minutes to run - and does not slow down production at all.
  [[http://www.meetup.com|Meetup]] is on a mission to help the world’s people self-organize
into local groups.  We use Hadoop and HBase to power a site-wide, real-time activity feed
system for all of our members and groups.  Group activity is written directly to HBase, and
indexed per member, with the member's custom feed served directly from HBase for incoming
requests.  We're running HBase 0.20.0 on an 11-node cluster.
+ [[http://www.mendeley.com|Mendeley]] We are creating a platform for researchers to collaborate
and share their research online. HBase is helping us to create the world's largest research
paper collection and is being used to store all our raw imported data. We use a lot of MapReduce
jobs to process these papers into pages displayed on the site. We also use HBase with
Pig to do analytics and produce the article statistics shown on the web site. You can find
out more about how we use HBase in these slides [http://www.slideshare.net/danharvey/hbase-at-mendeley].
  [[http://ning.com|Ning]] uses HBase to store and serve the results of processing user events
and log files, which allows us to provide near-real-time analytics and reporting. We use a
small cluster of commodity machines with 4 cores and 16GB of RAM per machine to handle all
our analytics and reporting needs.
