From: "kulkarni.swarnim@gmail.com"
Date: Wed, 17 Jul 2013 12:59:59 -0500
Subject: Re: which approach is better
To: user@hive.apache.org

First of all, that might not be the right way to choose the underlying storage. Choose HDFS or HBase depending on whether the data will be used for batch processing or whether you need random access to it. HBase is just another layer on top of HDFS, so queries running against HBase are necessarily going to be less efficient. If you can get away with using plain HDFS, that is the best and simplest approach.

On Wed, Jul 17, 2013 at 12:40 PM, Hamza Asad wrote:

> Please let me know which approach is better: should I save my data directly
> to HDFS and run Hive (Shark) queries over it, or store my data in HBase and
> then query it? I want to ensure efficient data retrieval, and that the data
> remains safe and can easily be recovered if Hadoop crashes.
>
> --
> *Muhammad Hamza Asad*

--
Swarnim
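For readers weighing the same trade-off, here is a minimal HiveQL sketch of the two options discussed above. The table names, column names, and HDFS path are hypothetical, chosen only for illustration; the storage-handler class and properties are Hive's standard HBase integration.

```sql
-- Option 1: plain HDFS-backed external table (efficient for batch scans).
-- Table/column names and the LOCATION path are hypothetical examples.
CREATE EXTERNAL TABLE events_hdfs (
  id      BIGINT,
  payload STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/user/hamza/events';

-- Option 2: HBase-backed table via Hive's HBase storage handler.
-- Gives random access by row key, but full scans pay the extra layer's cost.
CREATE TABLE events_hbase (
  id      BIGINT,
  payload STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,d:payload')
TBLPROPERTIES ('hbase.table.name' = 'events');
```

Both tables can then be queried with the same `SELECT` statements; the difference is where the data lives and which access pattern each backend favors.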