hbase-user mailing list archives

From Jason Rutherglen <jason.rutherg...@gmail.com>
Subject Re: How to efficiently join HBase tables?
Date Tue, 31 May 2011 18:48:02 GMT
Doesn't Hive for HBase enable joins?

On Tue, May 31, 2011 at 5:06 AM, Eran Kutner <eran@gigya.com> wrote:
> Hi,
> I need to join two HBase tables. The obvious way is to use an M/R job for
> that. The problem is that the few references to that question I found
> recommend pulling one table into the mapper and then doing a lookup for the
> referenced row in the second table.
> This sounds like a very inefficient way to do a join with MapReduce. I
> believe it would be much better to feed the rows of both tables to the
> mapper and let it emit a key based on the join fields. Since all the rows
> with the same join field values will have the same key, the reducer will be
> able to easily generate the result of the join.
> The problem with this is that I couldn't find a way to feed two tables to a
> single MapReduce job. I could probably dump the tables to files in a single
> directory and then run the join on the files, but that really makes no sense.
> Am I missing something? Any other ideas?
> -eran
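
The reduce-side join described in the quoted message can be sketched in
plain Python, with no Hadoop or HBase involved. This is only an
illustration of the idea, not a working M/R job: the function names, the
"left"/"right" tags, and the dict-per-row layout are all assumptions made
for the example.

```python
from collections import defaultdict
from itertools import product


def map_phase(table_tag, rows, join_field):
    """Emit (join_key, (table_tag, row)) for every row of one table."""
    for row in rows:
        yield row[join_field], (table_tag, row)


def shuffle(pairs):
    """Group tagged rows by join key, as the M/R shuffle step would."""
    groups = defaultdict(list)
    for key, tagged_row in pairs:
        groups[key].append(tagged_row)
    return groups


def reduce_phase(tagged_rows):
    """Pair every left row with every right row that shares the key."""
    left_rows = [row for tag, row in tagged_rows if tag == "left"]
    right_rows = [row for tag, row in tagged_rows if tag == "right"]
    for left, right in product(left_rows, right_rows):
        yield {**left, **right}


def reduce_side_join(left_rows, right_rows, join_field):
    """Inner join of two row lists on join_field (hypothetical helper)."""
    pairs = list(map_phase("left", left_rows, join_field))
    pairs += list(map_phase("right", right_rows, join_field))
    joined = []
    for tagged_rows in shuffle(pairs).values():
        joined.extend(reduce_phase(tagged_rows))
    return joined


if __name__ == "__main__":
    users = [{"uid": 1, "name": "ann"}, {"uid": 2, "name": "bob"}]
    orders = [{"uid": 1, "item": "book"}, {"uid": 3, "item": "pen"}]
    for row in reduce_side_join(users, orders, "uid"):
        print(row)
```

Tagging each row with its source table is what lets a single reducer tell
the two inputs apart. Keys that appear in only one table produce no output,
so this sketch behaves like an inner join.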
