hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/PoweredBy" by JeanDanielCryans
Date Wed, 11 Mar 2009 18:48:47 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by JeanDanielCryans:

The comment on the change is:
Added openplaces

  [http://www.yahoo.com/ Yahoo!] uses HBase to store document fingerprints for detecting near-duplicates.
We have a cluster of a few nodes that runs HDFS, MapReduce, and HBase. The table contains millions
of rows. We use it to query for duplicate documents with real-time traffic.
+ [http://www.openplaces.org Openplaces] is a search engine for travel that uses HBase to
store terabytes of web pages and travel-related entity records (countries, cities, hotels,
etc.). We have dozens of MapReduce jobs that crunch data on a daily basis. We use a 20-node
cluster for development, a 10-node cluster (currently being scaled) for offline production
processing, and an EC2 cluster for the live web site.
