hbase-user mailing list archives

From Tim Robertson <timrobertson...@gmail.com>
Subject Re: How to join tables in HBase 20.3
Date Fri, 19 Mar 2010 16:29:12 GMT
If you need joins for offline reporting (i.e. not real-time search), you can
do them with MapReduce, but expect long-running jobs.
I am using Hive, which gives you SQL and compiles that SQL to MapReduce jobs.
I am running it on tab-delimited files, but I have read that Hive now has
HBase input formats, meaning you can join HBase tables. The queries will not
be fast, but they will meet long-running join needs (e.g. reports).
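
To make the MapReduce approach concrete, here is a minimal sketch of a
reduce-side join, simulated in memory with plain Python rather than run on
Hadoop. The table names, rows, and function names are all made up for
illustration; a real job would read its input from files or HBase instead
of lists. As far as I understand it, a Hive JOIN typically compiles down to
a job of roughly this shape.

```python
from collections import defaultdict

# Hypothetical rows standing in for two tables, as (join_key, value) records.
users = [("u1", "Alice"), ("u2", "Bob"), ("u3", "Carol")]
orders = [("u1", "book"), ("u1", "pen"), ("u2", "lamp"), ("u4", "desk")]


def map_phase(rows, tag):
    """Emit (join_key, (source_tag, value)) so the reducer can tell tables apart."""
    for key, value in rows:
        yield key, (tag, value)


def shuffle(pairs):
    """Group mapper output by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, tagged in pairs:
        groups[key].append(tagged)
    return groups


def reduce_phase(key, tagged_values):
    """Inner join: pair every 'users' value with every 'orders' value for one key."""
    left = [v for tag, v in tagged_values if tag == "users"]
    right = [v for tag, v in tagged_values if tag == "orders"]
    for l in left:
        for r in right:
            yield key, l, r


def reduce_side_join(left_rows, right_rows):
    pairs = list(map_phase(left_rows, "users")) + list(map_phase(right_rows, "orders"))
    joined = []
    # Sort keys only to make the output order deterministic in this sketch.
    for key, tagged_values in sorted(shuffle(pairs).items()):
        joined.extend(reduce_phase(key, tagged_values))
    return joined


result = reduce_side_join(users, orders)
for row in result:
    print(row)
```

Keys that appear in only one table (u3 and u4 here) produce no output, which
is inner-join behaviour. Every row of both tables passes through the shuffle,
which is one reason these jobs are slow compared with an indexed lookup.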

Tim


On Fri, Mar 19, 2010 at 5:03 PM, Jonathan Gray <jgray@facebook.com> wrote:

> At some point joins may be necessary when denormalization is not possible.
>
> There is no built-in mechanism to do it.  It would be a series of
> additional Get calls to the second table you are joining against.  This
> would be helped significantly with a parallel MultiGet which will hopefully
> make it to 0.21.
>
> JG
>
> > -----Original Message-----
> > From: TuX RaceR [mailto:tuxracer69@gmail.com]
> > Sent: Friday, March 19, 2010 8:41 AM
> > To: hbase-user@hadoop.apache.org
> > Subject: Re: How to join tables in HBase 20.3
> >
> > Hi Raffi,
> >
> > When dealing with key-value stores, you need to think in a different
> > way; see for instance:
> >
> > http://wiki.apache.org/hadoop/Hbase/DataModel
> >
> > "Getting high scalability from your relational database isn't done by
> > simply adding more machines because its data model is based on a
> > single-machine architecture. For example, a JOIN between two tables is
> > done in memory and does not take into account the possibility that the
> > data has to go over the wire."
> >
> > JOIN simply does not scale in relational databases.
> >
> >
> > see also
> >
> > http://wiki.apache.org/hadoop/Hbase/FAQ#A20
> >
> > *20 Are there any Schema Design examples?*
> >
> >
> > Hope this helps,
> >
> > Cheers
> > TuX
> >
> >
> > Basmajian, Raffi wrote:
> > > I am new to HBase and come from an RDBMS background. After looking at
> > > the sample client code, it seems fairly easy to query a single table
> > > using Get and Scan, but it's not so obvious how to join data across
> > > multiple tables.
> > >
> > > Are there any examples of how to read/join data across multiple
> > > tables?
> > >
> > > Thank you
> > >
> > > Raffi Basmajian
> > >
>
>
