hbase-user mailing list archives

From Michel Segel <michael_se...@hotmail.com>
Subject Re: How to design a data warehouse in HBase?
Date Thu, 13 Dec 2012 08:43:31 GMT
You need to spend a bit of time on Schema design.
You need to flatten your Schema...
Implement some secondary indexing to improve join performance...
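
A minimal sketch of what flattening plus a secondary index could look like. It is not HBase client code: plain Python dicts and a sorted list stand in for the tables, and every name in it (`dim_product`, `main_table`, `index_keys`, the `|` delimiter) is a hypothetical chosen for illustration.

```python
from bisect import bisect_left

# Hypothetical star-schema input: one fact table plus one dimension table.
dim_product = {"p1": {"category": "books"}, "p2": {"category": "toys"}}
facts = [
    {"date": "20121201", "store": "s1", "product": "p1", "amount": 10},
    {"date": "20121201", "store": "s2", "product": "p2", "amount": 5},
    {"date": "20121202", "store": "s1", "product": "p2", "amount": 7},
]

def flatten(fact):
    """Denormalize: copy the dimension attributes into the fact row."""
    row = dict(fact)
    row.update(dim_product[fact["product"]])
    return row

def primary_key(row):
    # Composite rowkey; the leading field decides which prefix scans are cheap.
    return "|".join([row["store"], row["date"], row["product"]])

# Stand-in for the main table: rowkey -> flattened row.
main_table = {primary_key(f): flatten(f) for f in facts}

# Stand-in for an index table: its rowkey is <category>|<main rowkey>.
index_keys = sorted(f"{row['category']}|{key}" for key, row in main_table.items())

def prefix_scan(sorted_keys, prefix):
    """Return all keys starting with prefix, like a range scan over sorted keys."""
    start = bisect_left(sorted_keys, prefix)
    out = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        out.append(key)
    return out

def query_by_category(category):
    """Scan the index by category, then fetch each row from the main table."""
    hits = prefix_scan(index_keys, category + "|")
    return [main_table[k.split("|", 1)[1]] for k in hits]
```

Here the index stores only keys, so the fact data is written once and the "join" against the dimension becomes one index scan plus point lookups. The cost is an extra write per row to keep the index in step with the main table.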

Depends on what you want to do... There are other options too...

Sent from a remote device. Please excuse any typos...

Mike Segel

On Dec 13, 2012, at 7:09 AM, lars hofhansl <lhofhansl@yahoo.com> wrote:

> For OLAP type queries you will generally be better off with a truly column oriented database.
> You can probably shoehorn HBase into this, but it wasn't really designed with raw scan performance along single columns in mind.
> ________________________________
> From: bigdata <bigdatabase@outlook.com>
> To: "user@hbase.apache.org" <user@hbase.apache.org> 
> Sent: Wednesday, December 12, 2012 9:57 PM
> Subject: How to design a data warehouse in HBase?
> Dear all,
> We have a traditional star-model data warehouse in an RDBMS, and now we want to move it to HBase. After studying HBase, I learned that HBase is normally queried by rowkey:
> 1. full rowkey (fastest)
> 2. rowkey filter (fast)
> 3. column family/qualifier filter (slow)
> How can I design the HBase tables to implement the warehouse functions, like:
> 1. Query by DimensionA
> 2. Query by DimensionA and DimensionB
> 3. Sum, count, distinct ...
> In my opinion, I should create several HBase tables with all combinations of different dimensions as the rowkey. This solution will lead to huge data duplication. Are there any good suggestions to solve it?
> Thanks a lot!
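
The access paths and query types in the question can be sketched in a small simulation. It assumes a composite rowkey of the form `<DimensionA>|<DimensionB>|<id>`; the data, the `|` delimiter and all function names are hypothetical, and an in-memory sorted list stands in for HBase's sorted rowkeys rather than using the real client API.

```python
from bisect import bisect_left

# Hypothetical rows keyed by "<dimA>|<dimB>|<event id>".
rows = {
    "east|web|001": {"amount": 10, "customer": "c1"},
    "east|web|002": {"amount": 4, "customer": "c2"},
    "east|shop|003": {"amount": 6, "customer": "c1"},
    "west|web|004": {"amount": 8, "customer": "c3"},
}
sorted_keys = sorted(rows)

def get(rowkey):
    """1. Full rowkey: a single point lookup."""
    return rows.get(rowkey)

def scan_prefix(prefix):
    """2. Rowkey prefix: read only the contiguous key range."""
    i = bisect_left(sorted_keys, prefix)
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        yield sorted_keys[i], rows[sorted_keys[i]]
        i += 1

def scan_filter(predicate):
    """3. Value filter: every row is read and tested."""
    for key in sorted_keys:
        if predicate(rows[key]):
            yield key, rows[key]

def aggregate(scan):
    """Sum, count and distinct computed over whatever the scan returns."""
    amounts, customers = [], set()
    for _, row in scan:
        amounts.append(row["amount"])
        customers.add(row["customer"])
    return {"sum": sum(amounts), "count": len(amounts), "distinct": len(customers)}

by_a = aggregate(scan_prefix("east|"))          # query by DimensionA
by_a_and_b = aggregate(scan_prefix("east|web|"))  # query by DimensionA and DimensionB
```

One key layout serves both "by DimensionA" and "by DimensionA and DimensionB" through prefixes, so not every combination of dimensions needs its own table. A query by DimensionB alone does not map to a prefix of this key and would need the filter path, a second key ordering, or an index.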
