From: Tim Robertson
To: user@hive.apache.org
Date: Wed, 13 Oct 2010 22:45:31 +0200
Subject: Re: HBase as input AND output?

That's right. Hive can use an HBase table as an input format to the hive
query regardless of output format, and can also write the output to an
HBase table regardless of the input format.

You can also supposedly do a join in Hive that uses 1 side of the join
from an HBase table, and the other side a text file, which is very
powerful. I haven't done it myself, but intend to shortly.

HTH,
Tim

On Wed, Oct 13, 2010 at 10:07 PM, Otis Gospodnetic wrote:
> Hi,
>
> I was wondering how I can query data stored in HBase and remembered Hive's HBase
> integration:
> http://wiki.apache.org/hadoop/Hive/HBaseIntegration
>
> After watching John Sichi's video
> (http://developer.yahoo.com/blogs/hadoop/posts/2010/04/hundreds_of_hadoop_fans_at_the/)
> I have a better idea about what functionality this integration provides, but
> I still have some questions.
>
> Would it be correct to say that Hive-HBase integration makes the following data
> flow possible:
>
> 0) Hive or Files ==> Custom HQL statement that aggregates data ==> HBase
> 1) HBase ==> Custom HQL statement that aggregates data ==> HBase
> 2) HBase ==> Custom HQL statement that aggregates data ==> output (console?)
>
> Of the above, 1) is what I'm wondering the most about right now.
>
> In other words, it seems to me that Hive may be able to look at *just* data
> stored in HBase *without* the typical data/files in HDFS that Hive normally runs
> its MR jobs against.
>
> Is this correct?
>
> Thanks,
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Hadoop ecosystem search :: http://search-hadoop.com/
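
Flow 1) from the quoted question (HBase ==> HQL ==> HBase) might look roughly like the sketch below. It is a small Python helper that only builds HiveQL strings; it does not connect to Hive or HBase. The storage handler class and the `hbase.columns.mapping` / `hbase.table.name` properties are the ones described on the HBaseIntegration wiki page linked in the question. Everything else (the helper names, the table names, the column names, the column family `cf`) is made up for illustration, and the generated statements have not been run against a live cluster.

```python
# Sketch only: builds HiveQL text for the flow HBase ==> HQL ==> HBase.
# All table/column names and the column family "cf" are hypothetical.

STORAGE_HANDLER = "org.apache.hadoop.hive.hbase.HBaseStorageHandler"


def hbase_table_ddl(hive_table, hbase_table, key_name, key_type, columns,
                    family="cf", external=False):
    """Return DDL for a Hive table backed by an HBase table.

    columns is a list of (name, hive_type) pairs for the non-key columns.
    external=True maps onto an HBase table that already exists.
    """
    col_defs = ", ".join(
        [f"{key_name} {key_type}"] + [f"{name} {typ}" for name, typ in columns]
    )
    # ":key" maps the first Hive column to the HBase row key.
    mapping = ",".join([":key"] + [f"{family}:{name}" for name, _ in columns])
    create = "CREATE EXTERNAL TABLE" if external else "CREATE TABLE"
    return (
        f"{create} {hive_table} ({col_defs})\n"
        f"STORED BY '{STORAGE_HANDLER}'\n"
        f"WITH SERDEPROPERTIES ('hbase.columns.mapping' = '{mapping}')\n"
        f"TBLPROPERTIES ('hbase.table.name' = '{hbase_table}')"
    )


def aggregate_into(target, source, key, value):
    """Return an aggregating HQL statement that reads one table and writes another."""
    return (
        f"INSERT OVERWRITE TABLE {target}\n"
        f"SELECT {key}, SUM({value}) FROM {source} GROUP BY {key}"
    )


if __name__ == "__main__":
    # Input: an existing HBase table exposed to Hive.
    print(hbase_table_ddl("page_hits", "hits", "page", "string",
                          [("hits", "bigint")], external=True) + ";")
    # Output: a second HBase-backed table that receives the aggregate.
    print(hbase_table_ddl("page_totals", "totals", "page", "string",
                          [("total", "bigint")]) + ";")
    print(aggregate_into("page_totals", "page_hits", "page", "hits") + ";")
```

Both tables in the generated statements use the HBase storage handler, so the query reads from HBase and writes back to HBase. Swapping either side for an ordinary Hive table would give flows 0) or 2) instead.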