hbase-user mailing list archives

From Jim Kellerman <...@powerset.com>
Subject RE: Newbie user questions
Date Mon, 03 Mar 2008 17:25:15 GMT
-1

HBase is a Bigtable clone, not a relational database or a
column-oriented database like C-Store.

---
Jim Kellerman, Senior Engineer; Powerset


> -----Original Message-----
> From: edward yoon [mailto:edward@udanax.org]
> Sent: Sunday, March 02, 2008 10:22 PM
> To: hbase-user@hadoop.apache.org
> Subject: Re: Newbie user questions
>
> I think a unified API design and easy guidance are needed.
> Therefore, I think the default HBase client APIs should
> map to the HQL client API.
>
> I would like to get an objective opinion.
>
> Thanks,
> Edward.
>
> On 3/3/08, Bryan Duxbury <bryan@rapleaf.com> wrote:
> > Alex,
> >
> > The HBase shell is meant only to be used for administrative purposes,
> > like managing tables. You can do limited CRUD operations, but they're
> > mostly there for the benefit of initial testing and tracking down
> > bugs. HQL is also not SQL, so you shouldn't anticipate there being
> > many SQL features.
> >
> > In the Java, REST and Thrift APIs for HBase, there are two types of
> > accesses - single-row gets and multi-row scans. There are a lot of
> > options surrounding gets, so there's probably something that matches
> > your needs, but you have to know the row key to start with. Scans are
> > used whenever you need to operate on a number of rows. The cursor
> > model is indeed the closest analogy for a scanner.
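
A minimal sketch of the two access types described above, assuming a table can be modeled as rows kept sorted by row key. The class AccessSketch and its methods are hypothetical stand-ins for illustration, not the HBase client API: get needs the exact row key, while scan walks rows in key order from a start row, much like a cursor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a table: rows sorted by row key, each row a
// map of column name -> value. This is NOT the HBase client API.
class AccessSketch {
    final TreeMap<String, Map<String, String>> rows = new TreeMap<>();

    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Single-row get: the caller must already know the row key.
    // Returns null when no such row exists.
    Map<String, String> get(String rowKey) {
        return rows.get(rowKey);
    }

    // Multi-row scan: visit row keys in sorted order starting at startRow
    // (inclusive), stopping after at most 'limit' rows.
    List<String> scan(String startRow, int limit) {
        List<String> keys = new ArrayList<>();
        for (String key : rows.tailMap(startRow, true).keySet()) {
            if (keys.size() == limit) {
                break;
            }
            keys.add(key);
        }
        return keys;
    }
}
```

Because the rows are kept in key order, a scan only needs a starting point, whereas a get is a direct lookup that cannot search by column value.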
> >
> > If you need to do a join in the traditional sense, then yes, you need
> > to have at least two scanners and do the joining yourself. However, if
> > possible, you might want to consider denormalizing the data from the
> > two tables you'd be joining into a single table. I don't mean one row
> > per <table1,table2> tuple - HBase supports an arbitrary number of
> > columns per row, so if your second table is really a subordinate
> > entity, you might get some benefit from moving all to one table.
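
A rough sketch of "two scanners and do the joining yourself", assuming both tables share the same row key and each scanner returns rows in key order. The scanners are modeled as plain Java iterators over sorted maps; ScannerJoinSketch and mergeJoin are hypothetical names, not part of any HBase API.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

class ScannerJoinSketch {
    // Merge-join two key-ordered "scanners" on row key. Each output row
    // holds the columns of both inputs for a key present in both.
    static Map<String, Map<String, String>> mergeJoin(
            Iterator<Map.Entry<String, Map<String, String>>> left,
            Iterator<Map.Entry<String, Map<String, String>>> right) {
        Map<String, Map<String, String>> joined = new TreeMap<>();
        Map.Entry<String, Map<String, String>> l = left.hasNext() ? left.next() : null;
        Map.Entry<String, Map<String, String>> r = right.hasNext() ? right.next() : null;
        while (l != null && r != null) {
            int cmp = l.getKey().compareTo(r.getKey());
            if (cmp < 0) {
                // Left key is smaller: advance the left scanner.
                l = left.hasNext() ? left.next() : null;
            } else if (cmp > 0) {
                // Right key is smaller: advance the right scanner.
                r = right.hasNext() ? right.next() : null;
            } else {
                // Keys match: combine the columns of both rows into one row.
                Map<String, String> row = new TreeMap<>(l.getValue());
                row.putAll(r.getValue());
                joined.put(l.getKey(), row);
                l = left.hasNext() ? left.next() : null;
                r = right.hasNext() ? right.next() : null;
            }
        }
        return joined;
    }
}
```

Each combined row here has the shape a denormalized table would store directly: one row key with the columns of both entities side by side, so a single scan could replace the join.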
> >
> > The return values for scanners are Java Maps containing your data
> > (assuming you're in the Java API). Does that answer your question?
> >
> > -Bryan
> >
> > On Mar 2, 2008, at 7:01 PM, alexthompson@sitelabs.com wrote:
> >
> > >
> > > Newbie user questions. Can you correct me if I am wrong in my
> > > following statements:
> > >
> > > I have looked into querying against HBase and come up with a few
> > > paths to do this: from the HBase shell I can use HQL; from code I am
> > > limited to scanners, which are roughly analogous to cursors. I
> > > 'obtain' a scanner and iterate over a table starting at a row, and
> > > once I have a row I can test values in columns.
> > >
> > > Thus for a 'SQL' type join I can fire up 2+ scanners on different
> > > tables and iterate over both, testing as I go. Are there performance
> > > problems? Is there a more efficient way to do this, or are scanners
> > > innately efficient?
> > >
> > > One other thing I can't see is the return value for a query. Do I
> > > build my own collection and hand it back to my calling methods, or
> > > do we have some helper collection objects (I noticed 'formatter')
> > > to do this?
> > >
> > > Cheers, Alex. Any help much appreciated.
> >
> >
>
>
> --
> B. Regards,
> Edward yoon @ NHN, corp.
>


