From: Lars George
To: user@hbase.apache.org
Subject: Re: How to efficiently join HBase tables?
Date: Wed, 1 Jun 2011 15:54:11 +0200

Hi Jason,

This was discussed in the past, using the HFileInputFormat. The issue is
that you somehow need to flush all in-memory data *and* perform a major
compaction - or else you would need all the logic of the ColumnTracker in
the HFIF. Since that needs to scan all storage files in parallel to do its
job, the MR task would not really be able to use the same approach.

Running a major compaction creates a lot of churn, so it is questionable
whether the gain is worth it. The files do carry a flag indicating whether
they were created by a major compaction, so you could scan only those and
ignore the newer ones - but then you are trailing the live data, and you
still do not handle delete markers/updates in the newer files.

No easy feat.

Lars

On Wed, Jun 1, 2011 at 2:41 AM, Jason Rutherglen wrote:
>> I'd imagine that join operations do not require realtime-ness, and so
>> faster batch jobs using Hive -> frozen HBase files in HDFS could be
>> the optimal way to go?
>
> In addition to lessening the load on the perhaps live RegionServer.
> There's no Jira for this; I'm tempted to open one.
>
> On Tue, May 31, 2011 at 5:18 PM, Jason Rutherglen wrote:
>>> The Hive-HBase integration allows you to create Hive tables that are
>>> backed by HBase
>>
>> In addition, HBase can be made to go faster for MapReduce jobs, if the
>> HFiles could be used directly in HDFS, rather than proxying through
>> the RegionServer.
>>
>> I'd imagine that join operations do not require realtime-ness, and so
>> faster batch jobs using Hive -> frozen HBase files in HDFS could be
>> the optimal way to go?
>>
>> On Tue, May 31, 2011 at 1:41 PM, Patrick Angeles wrote:
>>> On Tue, May 31, 2011 at 3:19 PM, Eran Kutner wrote:
>>>
>>>> For my need I don't really need the general case, but even if I did I
>>>> think it can probably be done more simply.
>>>> The main problem is getting the data from both tables into the same
>>>> MR job without resorting to lookups. So without the theoretical
>>>> MultiTableInputFormat, I could just copy all the data from both
>>>> tables into a temp table, appending the source table name to the row
>>>> keys to make sure there are no conflicts. When all the data from both
>>>> tables is in the same temp table, run a MR job. For each row, the
>>>> mapper should emit a key composed of all the values of the join
>>>> fields in that row (the value can be emitted as is). This will cause
>>>> all the rows from both tables with the same join field values to
>>>> arrive at the reducer together. The reducer can then iterate over
>>>> them and produce the Cartesian product as needed.
>>>>
>>>> I still don't like having to copy all the data into a temp table just
>>>> because I can't feed two tables into the MR job.
>>>>
>>>
>>> Loading the smaller table in memory is called a map join, versus a
>>> reduce-side join (a.k.a. common join).
>>> One reason to prefer a map join is that you avoid the shuffle phase,
>>> which potentially involves several trips to disk for the intermediate
>>> records due to spills, and also one pass through the network to get
>>> each intermediate KV pair to the right reducer. With a map join,
>>> everything is local, except for the part where you load the small
>>> table.
>>>
>>>>
>>>> As Jason Rutherglen mentioned above, Hive can do joins. I don't know
>>>> if it can do them for HBase and it will not suit my needs, but it
>>>> would be interesting to know how it does them, if anyone knows.
>>>>
>>>
>>> The Hive-HBase integration allows you to create Hive tables that are
>>> backed by HBase. You can do joins on those tables (and also with
>>> standard Hive tables). It might be worth trying out in your case, as
>>> it lets you easily see the load characteristics and the job runtime
>>> without much coding investment.
>>>
>>> There are probably some specific optimizations that can be applied to
>>> your situation, but it's hard to say without knowing your use case.
>>>
>>> Regards,
>>>
>>> - Patrick
>>>
>>>
>>>> -eran
>>>>
>>>>
>>>> On Tue, May 31, 2011 at 22:02, Ted Dunning wrote:
>>>>
>>>> > The Cartesian product often makes an honest-to-god join not such a
>>>> > good idea on large data. The common alternative is co-group, which
>>>> > is basically like doing the hard work of the join, but involves
>>>> > stopping just before emitting the Cartesian product. This allows
>>>> > you to inject whatever cleverness you need at this point.
>>>> >
>>>> > Common kinds of cleverness include down-sampling of problematically
>>>> > large sets of candidates.
>>>> >
>>>> > On Tue, May 31, 2011 at 11:56 AM, Michael Segel
>>>> > wrote:
>>>> >
>>>> > > So the underlying problem that the OP was trying to solve was how
>>>> > > to join two tables from HBase.
>>>> > > Unfortunately I goofed.
>>>> > > I gave a quick and dirty solution that is a bit incomplete. The
>>>> > > row key in the temp table has to be unique, and I forgot about
>>>> > > the Cartesian product, so my solution wouldn't work in the
>>>> > > general case.
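[Editor's note] The temp-table approach Eran outlines above (prefix row keys with the source table name, have the mapper emit a key built from the join-field values, and let the reducer produce the Cartesian product per key) can be sketched as a small simulation. This is plain Python standing in for the mapper/reducer pair, not real MapReduce or HBase code; the table names, fields, and data are made up for illustration.

```python
from collections import defaultdict
from itertools import product

def map_phase(tables):
    """Tag each row with its source table and key it by the join field -
    the analogue of copying both tables into one temp table with
    source-prefixed row keys."""
    for name, rows in tables.items():
        for row in rows:
            yield row["join_field"], (name, row)

def reduce_phase(pairs, left_name, right_name):
    """Group by join key; rows from both tables with the same join-field
    values arrive together, and we emit their Cartesian product."""
    groups = defaultdict(lambda: defaultdict(list))
    for key, (name, row) in pairs:
        groups[key][name].append(row)
    joined = []
    for key, by_table in groups.items():
        # Keys present in only one table contribute nothing.
        for l, r in product(by_table[left_name], by_table[right_name]):
            joined.append((key, l["name"], r["item"]))
    return sorted(joined)

# Hypothetical tables joined on "join_field".
tables = {
    "users": [{"join_field": "u1", "name": "alice"},
              {"join_field": "u2", "name": "bob"}],
    "orders": [{"join_field": "u1", "item": "book"},
               {"join_field": "u1", "item": "pen"}],
}
result = reduce_phase(map_phase(tables), "users", "orders")
```

The `(name, row)` tag plays the same role as the source-table prefix on the temp-table row keys: it keeps the two tables' rows distinguishable once they share one key space.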
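[Editor's note] Patrick's map join can be sketched under the same caveats: the small table is loaded into an in-memory lookup once, and the large side is streamed past it, so there is no shuffle phase at all. Again a hypothetical plain-Python sketch, not Hive's actual implementation.

```python
def map_join(small_table, large_rows, join_field):
    """Load the small table into memory once, then stream the large side;
    everything stays local to the mapper - no shuffle, no reducer."""
    lookup = {}
    for row in small_table:
        lookup.setdefault(row[join_field], []).append(row)
    for big in large_rows:
        for small in lookup.get(big[join_field], []):
            yield {**small, **big}

# Hypothetical data: the "small" side must fit in memory.
small = [{"user_id": "u1", "name": "alice"}]
large = [{"user_id": "u1", "item": "book"},
         {"user_id": "u2", "item": "pen"}]
out = list(map_join(small, large, "user_id"))
```

This only applies when one side genuinely fits in memory, which is exactly the trade-off Patrick describes.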
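[Editor's note] Ted's co-group alternative, sketched with the same disclaimers: group both sides by key but stop just before emitting the Cartesian product, so per-key logic (here, down-sampling a problematically large candidate set) can be injected.

```python
from collections import defaultdict

def cogroup(left, right, key):
    """Group both inputs by key without joining them yet."""
    groups = defaultdict(lambda: ([], []))
    for row in left:
        groups[row[key]][0].append(row)
    for row in right:
        groups[row[key]][1].append(row)
    return dict(groups)

def capped_join(left, right, key, cap=2):
    """Join, but down-sample any right-side group larger than `cap`."""
    out = []
    for k, (ls, rs) in sorted(cogroup(left, right, key).items()):
        rs = rs[:cap]  # the injected cleverness: truncate oversized sets
        out.extend((k, l["v"], r["v"]) for l in ls for r in rs)
    return out

# Hypothetical data with one oversized right-side group.
left = [{"k": "a", "v": 1}]
right = [{"k": "a", "v": x} for x in (10, 20, 30)]
res = capped_join(left, right, "k")
```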