hbase-user mailing list archives

From "edward yoon" <edw...@udanax.org>
Subject Re: Newbie user questions
Date Mon, 03 Mar 2008 06:21:55 GMT
I think a unified API design and simple guidance are needed.
Therefore, I think the default HBase client APIs should map to the HQL
client API.

I would like to get an objective opinion.

Thanks,
Edward.

On 3/3/08, Bryan Duxbury <bryan@rapleaf.com> wrote:
> Alex,
>
> The HBase shell is meant only to be used for administrative purposes,
> like managing tables. You can do limited CRUD operations, but they're
> mostly there for the benefit of initial testing and tracking down
> bugs. HQL is also not SQL, so you shouldn't anticipate there being
> many SQL features.
>
> In the Java, REST and Thrift APIs for HBase, there are two types of
> accesses - single-row gets and multi-row scans. There are a lot of
> options surrounding gets, so there's probably something that matches
> your needs, but you have to know the row key to start with. Scans are
> used whenever you need to operate on a number of rows. The cursor
> model is indeed the closest analogy for a scanner.
>
> If you need to do a join in the traditional sense, then yes, you need
> to have at least two scanners and do the joining yourself. However,
> if possible, you might want to consider denormalizing the data from
> the two tables you'd be joining into a single table. I don't mean one
> row per <table1,table2> tuple - HBase supports an arbitrary number of
> columns per row, so if your second table is really a subordinate
> entity, you might get some benefit from moving all to one table.
>
> The return values for scanners are Java Maps containing your data
> (assuming you're in the Java API). Does that answer your question?
>
> -Bryan
>
> On Mar 2, 2008, at 7:01 PM, alexthompson@sitelabs.com wrote:
>
> >
> > Newbie user questions. Can you correct me if I am wrong in my
> > following statements:
> >
> > I have looked into querying against HBase and found a few paths to
> > do this. From the HBase shell I can use HQL. From code I am limited
> > to scanners, which are roughly analogous to cursors: I 'obtain' a
> > scanner and iterate over a table starting at a row, and once I have
> > a row I can test values in columns.
> >
> > Thus for a 'SQL'-type join I can fire up 2+ scanners on different
> > tables and iterate over both, testing as I go. Are there performance
> > problems with this? Is there a more efficient way to do it, or are
> > scanners innately efficient?
> >
> > One other thing I can't see is the return value for a query. Do I
> > build my own collection and hand it back to my calling methods, or
> > are there helper collection objects (I noticed 'formatter') to do
> > this?
> >
> > Cheers, Alex. Any help much appreciated.
>
>
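
The two-scanner join described in the quoted reply can be sketched in
plain Java. This is only an illustration under stated assumptions, not
the HBase client API: each "table" is modeled as a SortedMap from row
key to a Map of column name to value, the two iterators stand in for
scanners, and the class name, method name, and column names are all
hypothetical. It relies only on the fact that both inputs are sorted by
row key, so each side is read once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Plain-Java stand-in for a two-scanner join; NOT the HBase client API. */
final class ScannerJoinSketch {

    /**
     * Merge-joins two row-key-sorted "tables" the way two scanners advancing
     * in step would. Assumes both maps use the natural String ordering.
     * Each result row is a Map holding the row key plus the columns of both sides.
     */
    static List<Map<String, String>> mergeJoin(
            SortedMap<String, Map<String, String>> left,
            SortedMap<String, Map<String, String>> right) {
        List<Map<String, String>> joined = new ArrayList<>();
        Iterator<Map.Entry<String, Map<String, String>>> l = left.entrySet().iterator();
        Iterator<Map.Entry<String, Map<String, String>>> r = right.entrySet().iterator();
        Map.Entry<String, Map<String, String>> a = l.hasNext() ? l.next() : null;
        Map.Entry<String, Map<String, String>> b = r.hasNext() ? r.next() : null;

        while (a != null && b != null) {
            int cmp = a.getKey().compareTo(b.getKey());
            if (cmp == 0) {
                // Same row key on both sides: emit one combined row.
                Map<String, String> row = new LinkedHashMap<>();
                row.put("row", a.getKey());
                row.putAll(a.getValue());
                row.putAll(b.getValue());
                joined.add(row);
                a = l.hasNext() ? l.next() : null;
                b = r.hasNext() ? r.next() : null;
            } else if (cmp < 0) {
                // Left key is smaller: advance only the left "scanner".
                a = l.hasNext() ? l.next() : null;
            } else {
                // Right key is smaller: advance only the right "scanner".
                b = r.hasNext() ? r.next() : null;
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        SortedMap<String, Map<String, String>> users = new TreeMap<>();
        users.put("u1", Collections.singletonMap("info:name", "Ann"));
        users.put("u2", Collections.singletonMap("info:name", "Bob"));

        SortedMap<String, Map<String, String>> orders = new TreeMap<>();
        orders.put("u2", Collections.singletonMap("order:total", "30"));
        orders.put("u3", Collections.singletonMap("order:total", "5"));

        // Only row key "u2" appears in both tables.
        System.out.println(mergeJoin(users, orders));
    }
}
```

If the second table is really a subordinate entity, the denormalized
layout suggested in the quoted reply would store both sets of columns
under the same row key, and a single scan would replace the merge loop.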


-- 
B. Regards,
Edward yoon @ NHN, corp.
