hbase-user mailing list archives

From Andrew Purtell <apurt...@yahoo.com>
Subject RE: implementing join on two Hbase tables
Date Sat, 06 Dec 2008 09:26:21 GMT
What I would do is, as Jonathan mentions, run two
mappers that select the data you'd like and write
the selected records, using a Hadoop output format, to
temporary files in DFS, then run a third join step
that combines the temporary files.
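
The two-mappers-plus-join-step flow can be sketched roughly as
below. This is only an in-memory illustration in plain Python,
not Hadoop or HBase code: the function names (map_table, shuffle,
reduce_join), the source tags and the sample rows are all made up
for the example, and the shuffle function merely stands in for the
sort step that MapReduce would do between the map and reduce phases.

```python
from collections import defaultdict
from itertools import product


def map_table(rows, tag):
    """Mapper stand-in: emit (join_key, (tag, record)) for each selected row."""
    for key, record in rows:
        yield key, (tag, record)


def shuffle(*mapped_outputs):
    """Stand-in for the MapReduce sort/shuffle: group tagged records by key."""
    groups = defaultdict(list)
    for output in mapped_outputs:
        for key, tagged in output:
            groups[key].append(tagged)
    return groups


def reduce_join(groups, left_tag, right_tag):
    """Reducer stand-in: inner join, pairing left and right records per key."""
    for key in sorted(groups):
        left = [rec for tag, rec in groups[key] if tag == left_tag]
        right = [rec for tag, rec in groups[key] if tag == right_tag]
        for l, r in product(left, right):
            yield key, l, r


# Hypothetical sample data standing in for rows selected from two tables.
users = [("u1", {"name": "ann"}), ("u2", {"name": "bob"})]
orders = [("u1", {"item": "book"}), ("u1", {"item": "pen"}),
          ("u3", {"item": "cup"})]

joined = list(reduce_join(
    shuffle(map_table(users, "users"), map_table(orders, "orders")),
    "users", "orders"))
```

Tagging each record with its source table is what lets the join
step tell the two inputs apart once they have been grouped under
the same key. Keys present in only one input (u2 and u3 here)
produce no output, so this is an inner join.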

As an additional step, if you have common (sub)queries
that require this kind of processing, I would suggest
proactively running them to materialize views into
HBase tables set up for that purpose. This works
especially well if your application can tolerate data
that is as stale as the interval between runs of such
jobs (each job must also complete within that interval).

Hope this helps,

   - Andy

> From: Jonathan Gray <jlist@streamy.com>
> Subject: RE: implementing join on two Hbase tables
> To: hbase-user@hadoop.apache.org
> Date: Friday, December 5, 2008, 10:34 AM
> My personal favorite is Cascading
> (http://www.cascading.org) by Chris
> Wensel.
> Hive and Pig are other projects that help with this, but
> they also do not have HBase hooks yet (that I'm aware of).
> You might also consider something like Pigi
> (http://www.pigi-project.org).
> Otherwise, you'll need to write your own jobs. 
> You'd probably need three different MR jobs: two that
> map from each of the HBase tables you're interested in,
> then another job that would read from the combined output
> of those two jobs and perform the join.  You might use the
> Map->Reduce sort step to perform the join if possible;
> it depends on the details of what you want to do.
> > From: abhinit [mailto:abhinit.kumar@gmail.com]
> > Sent: Friday, December 05, 2008 2:32 AM
> > To: hbase-user@hadoop.apache.org
> > Subject: implementing join on two Hbase tables
> > 
> > I am trying to implement hash-join and nested join on
> > two Hbase tables.
> > However, I am stuck.
> > Thanks a lot
> > -Abhinit

