hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Trivial Update of "Hbase/PoweredBy" by JeanDanielCryans
Date Fri, 29 May 2009 12:50:23 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by JeanDanielCryans:
http://wiki.apache.org/hadoop/Hbase/PoweredBy

------------------------------------------------------------------------------
  
  [http://www.yahoo.com/ Yahoo!] uses HBase to store document fingerprint for detecting near-duplications.
We have a cluster of few nodes that runs HDFS, mapreduce, and HBase. The table contains millions
of rows. We use this for querying duplicated documents with realtime traffic.
  
- [http://www.openplaces.org Openplaces] is a search engine for travel that uses HBase to
store terabytes of web pages and travel-related entity records (countries, cities, hotels,
etc.). We have dozens of MapReduce jobs that crunch data on a daily basis.  We use a 20-node
cluster for development, a 10-node cluster (currently being scaled) for offline production
processing and an EC2 cluster for the live web site. 
+ [http://www.openplaces.org Openplaces] is a search engine for travel that uses HBase to
store terabytes of web pages and travel-related entity records (countries, cities, hotels,
etc.). We have dozens of MapReduce jobs that crunch data on a daily basis.  We use a 20-node
cluster for development, a 40-node cluster for offline production processing and an EC2 cluster
for the live web site. 
  
