hbase-user mailing list archives

From "Buttler, David" <buttl...@llnl.gov>
Subject RE: How to join tables in HBase 20.3
Date Fri, 19 Mar 2010 16:27:36 GMT
This particular query seems quite straightforward:
Scan the claim table with a filter that only returns entries from the last month.  Get the
customer id and policy id from the claim record (i.e., the foreign keys).  Use Get to retrieve
the matching rows from the customer and policy tables.  Is the complaint that you have to write
this yourself, or that there is no referential integrity between the tables, or something else?

Dave
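
The scan-then-Get approach described above can be sketched as follows. This is
only an illustration under stated assumptions: plain Python dictionaries stand
in for the Customer, Policy, and Claim tables, and the scan() and get() helpers,
the column names (customer_id, policy_id, filed), and the sample rows are all
hypothetical. None of this is the HBase client API; it only shows the shape of
a join done on the client side.

```python
from datetime import datetime, timedelta

# Hypothetical in-memory stand-ins for three HBase tables, keyed by row key.
CUSTOMER = {"c1": {"name": "Ada"}, "c2": {"name": "Bob"}}
POLICY = {"p1": {"type": "auto"}, "p2": {"type": "auto"}}
CLAIM = {
    "cl1": {"customer_id": "c1", "policy_id": "p1", "filed": datetime(2010, 3, 10)},
    "cl2": {"customer_id": "c2", "policy_id": "p2", "filed": datetime(2010, 1, 5)},
}


def scan(table, predicate):
    """Stand-in for a filtered Scan: yield (row_key, row) pairs in key order."""
    for key in sorted(table):
        if predicate(table[key]):
            yield key, table[key]


def get(table, row_key):
    """Stand-in for a Get: return the row for one key, or None if absent."""
    return table.get(row_key)


def recent_claims_joined(now, days=30):
    """Join recent claims to their customer and policy rows on the client."""
    cutoff = now - timedelta(days=days)
    joined = []
    # Step 1: scan the claim table, keeping only claims filed since the cutoff.
    for claim_key, claim in scan(CLAIM, lambda row: row["filed"] >= cutoff):
        # Step 2: follow the foreign keys with one lookup per referenced table.
        customer = get(CUSTOMER, claim["customer_id"])
        policy = get(POLICY, claim["policy_id"])
        joined.append({"claim": claim_key, "customer": customer, "policy": policy})
    return joined
```

Calling recent_claims_joined(datetime(2010, 3, 19)) on the sample rows returns
one joined record, for claim cl1. Nothing here enforces referential integrity:
if a claim points at a missing customer or policy, the lookup simply returns
None and the caller has to decide what to do with it.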

-----Original Message-----
From: Basmajian, Raffi [mailto:rbasmajian@oppenheimerfunds.com] 
Sent: Friday, March 19, 2010 9:20 AM
To: hbase-user@hadoop.apache.org
Subject: RE: How to join tables in HBase 20.3

JG,

I understand that there is no built-in mechanism to do joins, but the
essence of combining data to make it more useful remains the same
regardless of whether it's an RDBMS, HBase, etc., so there must be
something in HBase that provides this functionality.

Assume for the moment that in HBase I have the tables Customer, Policy,
and Claim for an auto insurance business. Say I want to get a list of
all customers that filed a claim on their auto policy in the past month.
If I use Get and/or Scan then that allows me to pull information from
each individual table, but I still need to combine the data to give me
the list of policies based on my original query. Is there additional
functionality in HBase that enables combining the data? I've been
searching in the samples and I can't find a clear and simple example.

Thanks
Raffi
 

-----Original Message-----
From: Jonathan Gray [mailto:jgray@facebook.com] 
Sent: Friday, March 19, 2010 12:03 PM
To: hbase-user@hadoop.apache.org
Subject: RE: How to join tables in HBase 20.3

At some point joins may be necessary when denormalization is not
possible.

There is no built-in mechanism to do it.  It would be a series of
additional Get calls to the second table you are joining against.  This
would be helped significantly with a parallel MultiGet which will
hopefully make it to 0.21.

JG
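
To illustrate why a parallel multi-get helps, here is a rough sketch of the
idea. It is not the HBase MultiGet feature mentioned above, whose API is not
shown in this thread. The multi_get() helper, the POLICY dictionary, and the
sample keys are hypothetical, and a thread pool over an in-memory lookup stands
in for concurrent Get calls against the second table.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for the second table in the join.
POLICY = {"p1": {"type": "auto"}, "p2": {"type": "home"}}


def get(table, row_key):
    """Stand-in for a single Get call; returns None for a missing row."""
    return table.get(row_key)


def multi_get(table, row_keys, max_workers=4):
    """Fetch each distinct key once, issuing the lookups concurrently."""
    unique_keys = list(dict.fromkeys(row_keys))  # dedupe, keep first-seen order
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        rows = list(pool.map(lambda key: get(table, key), unique_keys))
    return dict(zip(unique_keys, rows))


# Foreign keys collected from the first table; repeats are common in a join.
policy_ids = ["p1", "p2", "p1", "p9"]
policies = multi_get(POLICY, policy_ids)
```

Deduplicating the keys first means a row referenced many times is fetched only
once, and running the lookups concurrently means the total wait is closer to
the slowest single lookup than to the sum of all of them.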

> -----Original Message-----
> From: TuX RaceR [mailto:tuxracer69@gmail.com]
> Sent: Friday, March 19, 2010 8:41 AM
> To: hbase-user@hadoop.apache.org
> Subject: Re: How to join tables in HBase 20.3
> 
> Hi Raffi,
> 
> When dealing with key-value stores, you need to think in a different
> way; see for instance:
> 
> http://*wiki.apache.org/hadoop/Hbase/DataModel
> 
> "Getting high scalability from your relational database isn't done by 
> simply adding more machines because its data model is based on a 
> single-machine architecture. For example, a JOIN between two tables is
> done in memory and does not take into account the possibility that the
> data has to go over the wire."
> 
> JOIN simply does not scale in relational databases.
> 
> 
> see also
> 
> http://*wiki.apache.org/hadoop/Hbase/FAQ#A20
> 
> *20 Are there any Schema Design examples?*
> 
> 
> Hope this helps,
> 
> Cheers
> TuX
> 
> 
> Basmajian, Raffi wrote:
> > I am new to HBase and come from an RDBMS background. After looking in the
> > sample client code it seems fairly easy to query a single table
> > using Get and Scan, but it's not so obvious how to join data across multiple
> > tables.
> >
> > Are there any examples on how to read/join data across multiple tables?
> >
> > Thank you
> >
> > Raffi Basmajian



------------------------------------------------------------------------------
This e-mail transmission may contain information that is proprietary, privileged and/or confidential
and is intended exclusively for the person(s) to whom it is addressed. Any use, copying, retention
or disclosure by any person other than the intended recipient or the intended recipient's
designees is strictly prohibited. If you are not the intended recipient or their designee,
please notify the sender immediately by return e-mail and delete all copies. OppenheimerFunds
may, at its sole discretion, monitor, review, retain and/or disclose the content of all email
communications. 
==============================================================================


