Date: Mon, 14 May 2012 12:41:32 -0700
From: Amandeep Khurana
To: user@hbase.apache.org
Subject: Re: hbase as a primary store, or is it more for "2nd class" data?
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="4fb15fec_704e1dd5_32ce"

--4fb15fec_704e1dd5_32ce
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

HDFS is designed not to lose data when a few nodes fail: it holds multiple replicas of each block. That said, it also depends on the definition of "a few". Many companies use HDFS as their central data store, and it is proven at scale in production. It does not lose data arbitrarily, and neither does HBase.

Have you come across a case where you experienced data loss with either HDFS or HBase? We'd be curious to learn about it.

On Monday, May 14, 2012 at 12:35 PM, Srikanth P. Shreenivas wrote:

> Yes, agreed that data can be lost in any DB. However, isn't it seen more frequently in NoSQL DBs? In the case of HBase, is it not possible for the underlying HDFS to lose data if nodes go down abruptly a few times?
>
> Andrew Purtell wrote:
>
> > Any data store may lose data, as a generic statement, so maybe you had something more specific in mind?
> >
> > On May 13, 2012, at 9:21 PM, "Srikanth P. Shreenivas" wrote:
> >
> > There is a possibility that you may lose data, and hence I would not use it for first-class data if the data cannot be re-created.
> > If you can derive the data from a secondary source and store it in HBase for performance gains, then it is a viable use case.
> >
> > Regards,
> > Srikanth
> >
> > -----Original Message-----
> > From: S Ahmed [mailto:sahmed1020@gmail.com]
> > Sent: Monday, May 14, 2012 7:52 AM
> > To: user@hbase.apache.org (mailto:user@hbase.apache.org); Otis Gospodnetic
> > Subject: Re: hbase as a primary store, or is it more for "2nd class" data?
> >
> > Otis,
> >
> > It kind of goes back to what I was saying earlier: if FB is using it for searching your inbox, or storing your chat messages or wall posts, I don't really think that is important (and really it isn't, hehe).
> >
> > I was just making an observation and wanted to get a feel for what others think. Obviously every tool has its purpose and domain, and I was curious as to what others have seen in production usage etc.
> >
> > (I do realize that in some use cases the data is very important, like analytics data that usually correlates to advertising $$ etc.)
> >
> > On Sun, May 13, 2012 at 10:00 PM, Otis Gospodnetic <otis_gospodnetic@yahoo.com (mailto:otis_gospodnetic@yahoo.com)> wrote:
> >
> > > Hi Ahmed,
> > >
> > > At Sematext we have a few SaaS products that use HBase as the primary
> > > data store. I hear Facebook uses HBase for some important stuff, too.
> > > ;) So far we've survived. HBase does have rough edges, but also good
> > > developers who are making it better every day.
> > >
> > > Otis
> > > ----
> > > Performance Monitoring for Solr / ElasticSearch / HBase -
> > > http://sematext.com/spm
> > >
> > > > ________________________________
> > > > From: S Ahmed
> > > > To: user@hbase.apache.org (mailto:user@hbase.apache.org)
> > > > Sent: Sunday, May 13, 2012 8:14 PM
> > > > Subject: hbase as a primary store, or is it more for "2nd class" data?
> > > >
> > > > I'm interested to learn if people are using hbase as a primary store
> > > > or is it more for "2nd class" type data.
> > > >
> > > > Pretend you have a CMS product, or eCommerce SaaS application:
> > > >
> > > > What I mean by this is, I consider "primary store" to mean storing
> > > > the actual content (say articles, or blog posts), category data, user
> > > > information, or shopping cart order, product information.
> > > >
> > > > "2nd class" type data is data like metrics, analytics, log data, or
> > > > say index data (data that can be re-built via the primary store).
> > > >
> > > > In general, 2nd class data is data that, if lost, won't bring the
> > > > business to its knees.
> > > >
> > > > What do you guys think, am I right?
> > > >
> > > > i.e. if you are creating a SaaS product, it wouldn't be advisable to
> > > > build it using hbase (or it will be kind of bleeding edge architecture).

--4fb15fec_704e1dd5_32ce--