Subject: Inserting data directly into HBase?
From: edward choi
To: common-user@hadoop.apache.org, user@hbase.apache.org
Date: Tue, 1 Mar 2011 20:28:58 +0900

Hi,

I am trying to crawl several thousand RSS feeds every 30 minutes, and I thought I could use Hadoop and HBase as my platform. However, I am not familiar with the HBase architecture and was wondering whether I could insert crawled news articles directly into HBase without first saving them to HDFS. I am asking this dumb question because all the HBase examples I have seen in reference books start by saving data to HDFS.

Also, if I have two computers, A for HDFS and B for HBase, what happens when I insert data directly into HBase? Is the data stored on B automatically, with a pointer made to A? Or is the data stored on A, with a pointer made to itself?

I really have no idea how HBase operates :(