From: Apache Wiki
To: core-commits@hadoop.apache.org
Date: Tue, 25 Aug 2009 14:35:34 -0000
Message-ID: <20090825143534.3004.42286@eos.apache.org>
Subject: [Hadoop Wiki] Update of "Hbase/PoweredBy" by LarsGeorge

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by LarsGeorge:
http://wiki.apache.org/hadoop/Hbase/PoweredBy

------------------------------------------------------------------------------
[http://www.videosurf.com/ VideoSurf] - "The video search engine that has taught computers to see". We're using HBase to persist various large graphs of data and other statistics. HBase was a real win for us because it let us store substantially larger datasets without having to partition the data manually, and its column-oriented nature allowed us to create schemas that were substantially more efficient for storing and retrieving data.
- [http://www.worldlingo.com/ WorldLingo] - The !WorldLingo Multilingual Archive. We use HBase to store millions of documents that we scan using Map/Reduce jobs to machine-translate them into all or selected target languages from our set of available machine translation languages. We currently store 12 million documents but plan to eventually reach the 450 million mark. HBase allows us to scale out as we need to grow our storage capacity. Combined with Hadoop, which keeps the data replicated and therefore fail-safe, we have the backbone our service can rely on now and in the future. WorldLingo has been using HBase since December 2007 and is, along with a few others, one of the longest-running HBase installations. Currently we are running the latest HBase 0.20 and serving directly from it: [http://www.worldlingo.com/ma/enwiki/en/HBase MultilingualArchive].
+ [http://www.worldlingo.com/ WorldLingo] - The !WorldLingo Multilingual Archive. We use HBase to store millions of documents that we scan using Map/Reduce jobs to machine-translate them into all or selected target languages from our set of available machine translation languages. We currently store 12 million documents but plan to eventually reach the 450 million mark. HBase allows us to scale out as we need to grow our storage capacity. Combined with Hadoop, which keeps the data replicated and therefore fail-safe, we have the backbone our service can rely on now and in the future. !WorldLingo has been using HBase since December 2007 and is, along with a few others, one of the longest-running HBase installations. Currently we are running the latest HBase 0.20 and serving directly from it: [http://www.worldlingo.com/ma/enwiki/en/HBase MultilingualArchive].
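A scan-with-Map/Reduce job like the one !WorldLingo describes is typically built on HBase's TableMapper and TableMapReduceUtil classes. The sketch below is illustrative only: the "documents" table and its content:source column are hypothetical names, not details from the entry above, and client API details vary between HBase versions.

{{{
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

// Scans every row of a hypothetical "documents" table and hands the
// source text to a translation step (the translation call itself is
// elided; only the HBase scan plumbing is shown).
public class TranslateDocuments {

  static class TranslateMapper extends TableMapper<ImmutableBytesWritable, Text> {
    @Override
    protected void map(ImmutableBytesWritable row, Result columns, Context context)
        throws IOException, InterruptedException {
      byte[] source = columns.getValue(Bytes.toBytes("content"), Bytes.toBytes("source"));
      // ... send `source` to the machine-translation engine and write the
      // result back, e.g. with a Put into a per-language column family ...
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "translate-documents");
    job.setJarByClass(TranslateDocuments.class);

    Scan scan = new Scan();                  // full-table scan
    scan.addColumn(Bytes.toBytes("content"), Bytes.toBytes("source"));
    scan.setCaching(500);                    // fetch rows in batches
    scan.setCacheBlocks(false);              // don't pollute the block cache

    TableMapReduceUtil.initTableMapperJob("documents", scan,
        TranslateMapper.class, ImmutableBytesWritable.class, Text.class, job);
    job.setNumReduceTasks(0);                // map-only job
    job.setOutputFormatClass(NullOutputFormat.class);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
}}}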
[http://www.yahoo.com/ Yahoo!] uses HBase to store document fingerprints for detecting near-duplicates. We have a cluster of a few nodes that runs HDFS, MapReduce, and HBase. The table contains millions of rows. We use this to query for duplicate documents against real-time traffic.
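A fingerprint table like the one Yahoo! describes can be keyed by the fingerprint itself, so a single Get answers whether a document has been seen before. The following sketch uses the classic HTable client API; the "fingerprints" table, its doc:id column, and the simhash-style key are assumptions made for illustration, not details from the entry.

{{{
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical near-duplicate check: the table is keyed by document
// fingerprint, so one Get answers "have we seen this content before?".
public class FingerprintLookup {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "fingerprints");   // classic client API
    try {
      byte[] fingerprint = Bytes.toBytes(args[0]);     // e.g. a simhash value

      Result existing = table.get(new Get(fingerprint));
      if (!existing.isEmpty()) {
        // Near-duplicate: another document already produced this fingerprint.
        byte[] docId = existing.getValue(Bytes.toBytes("doc"), Bytes.toBytes("id"));
        System.out.println("duplicate of " + Bytes.toString(docId));
      } else {
        // First sighting: record which document owns this fingerprint.
        Put put = new Put(fingerprint);
        put.add(Bytes.toBytes("doc"), Bytes.toBytes("id"), Bytes.toBytes(args[1]));
        table.put(put);
      }
    } finally {
      table.close();
    }
  }
}
}}}

Keying on the fingerprint keeps the lookup a constant-time point read regardless of table size, which is what makes checking against real-time traffic feasible.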