hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/PoweredBy" by BradfordStephens
Date Wed, 07 Oct 2009 20:59:16 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hbase/PoweredBy" page has been changed by BradfordStephens:

  [[http://www.adobe.com|Adobe]] - We currently have about 30 nodes running HDFS, Hadoop and
HBase in clusters ranging from 5 to 14 nodes on both production and development. We plan
a deployment on an 80-node cluster. We are using HBase in several areas from social services
to structured data and processing for internal use. We constantly write data to HBase and
run MapReduce jobs to process it, then store it back to HBase or external systems. Our production
cluster has been running since Oct 2008.
  [[http://www.flurry.com|Flurry]] provides mobile application analytics.  We use HBase and
Hadoop for all of our analytics processing, and serve all of our live requests directly out
of HBase on our production cluster with billions of rows over several tables.
+ [[http://www.drawntoscaleconsulting.com|Drawn to Scale Consulting]] consults on HBase, Hadoop,
distributed search, and scalable architectures.
  [[http://gumgum.com|GumGum]] is an analytics and monetization platform for online content.
We've developed usage-based licensing models that make the best content in the world accessible
to publishers of all sizes.  We use HBase 0.20.0 on a 4-node Amazon EC2 cluster to record
visits to advertisers in our ad network. Our production cluster has been running since July
@@ -30, +32 @@

  [[http://www.videosurf.com/|VideoSurf]] - "The video search engine that has taught computers
to see". We're using HBase to persist various large graphs of data and other statistics. HBase
was a real win for us because it let us store substantially larger datasets without the need
to manually partition the data, and its column-oriented nature allowed us to create schemas
that were substantially more efficient for storing and retrieving data.
+ [[http://www.visibletechnologies.com/|Visible Technologies]] - We use Hadoop, HBase, Katta,
and more to collect, parse, store, and search hundreds of millions of pieces of social media content.
We get incredibly fast throughput and very low latency on commodity hardware. HBase enables
our business to exist. 
  [[http://www.worldlingo.com/|WorldLingo]] - The !WorldLingo Multilingual Archive. We use
HBase to store millions of documents that we scan using Map/Reduce jobs to machine translate
them into all or selected target languages from our set of available machine translation languages.
We currently store 12 million documents but plan to eventually reach the 450 million mark.
HBase allows us to scale out as we need to grow our storage capacities. Combined with Hadoop
to keep the data replicated and therefore fail-safe, we have the backbone our service can rely
on now and in the future. !WorldLingo has been using HBase since December 2007 and is, along with
a few others, one of the longest-running HBase installations. Currently we are running the latest
HBase 0.20 and serving directly from it: [[http://www.worldlingo.com/ma/enwiki/en/HBase|MultilingualArchive]].
  [[http://www.yahoo.com/|Yahoo!]] uses HBase to store document fingerprints for detecting
near-duplicates. We have a cluster of a few nodes that runs HDFS, MapReduce, and HBase. The
table contains millions of rows. We use this for querying duplicate documents with realtime
