hbase-user mailing list archives

From "Pamecha, Abhishek" <apame...@paypal.com.INVALID>
Subject RE: How to Manage Data Architecture & Modeling for HBase
Date Mon, 06 Apr 2015 17:09:23 GMT
I would stress that if you envision any joins or arbitrary slicing and dicing at a later point
in your application, you should either redesign your schema very carefully or be ready for
slower (not near-real-time) answers. We explored a possible solution along similar lines, but
a hashtable approach (as expected) isn’t the best fit for database joins or for slicing on
arbitrary columns across the whole dataset. We had to switch back to a relational DB for our
use case.
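
A rough way to picture the point above about key lookups versus arbitrary slicing is the toy sketch below. It is not HBase code: a plain Python dict stands in for a keyed table, and the row keys, column names, and helper functions are all made up for illustration.

```python
# Toy stand-in for a key-value table: row key -> record.
# All names and values here are hypothetical illustration data.
table = {
    "cust#001": {"name": "Ann", "city": "Austin", "balance": 120},
    "cust#002": {"name": "Bob", "city": "Boston", "balance": 75},
    "cust#003": {"name": "Cy", "city": "Austin", "balance": 300},
}


def get_by_key(table, row_key):
    """Lookup by row key goes straight to the record."""
    return table.get(row_key)


def slice_by_column(table, column, value):
    """Filtering on a non-key column has to visit every row.

    Returns the matching row keys and the number of rows scanned.
    """
    scanned = 0
    matches = []
    for row_key, record in table.items():
        scanned += 1
        if record.get(column) == value:
            matches.append(row_key)
    return matches, scanned
```

Calling `slice_by_column(table, "city", "Austin")` scans all three rows to find two matches, while `get_by_key(table, "cust#002")` touches one. At real data volumes that full scan is where the "not near-real-time" answers tend to come from, unless some index is maintained on the side.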


-----Original Message-----
From: Michael Segel [mailto:michael_segel@hotmail.com] 
Sent: Monday, April 06, 2015 9:55 AM
To: user@hbase.apache.org
Cc: user@phoenix.apache.org
Subject: Re: How to Manage Data Architecture & Modeling for HBase

I should add that in terms of financial modeling… 

It’s easier to store derivatives and synthetic instruments because you aren’t really constrained
by a relational model. 
(Derivatives are nothing more than a contract.) 



> On Apr 6, 2015, at 8:34 AM, Ben Liang <liangpc@hotmail.com> wrote:
> Thank you for your prompt reply.
> In my daily work, I mainly use Oracle DB to build a data warehouse with star-schema
> data modeling, for financial analysis and marketing analysis.
> Now I am trying to use HBase to do it.
> I have some questions:
> 1) Many tables from the ERP need to be loaded incrementally every day,
> including some inserts and some updates. Is HBase appropriate for
> building a data warehouse in this scenario?
> 2) Are there any case studies of enterprise BI solutions built on HBase?
> Thanks.
> Regards,
> Ben Liang
>> On Apr 6, 2015, at 20:27, Michael Segel <michael_segel@hotmail.com> wrote:
>> Yeah. Jean-Marc is right. 
>> You have to think more in terms of a hierarchical model where you’re modeling records,
>> not relationships.
>> Your model would look like a single ER box per record type. 
>> The HBase schema is very simple: tables and column families, and that’s it for static
>> structures. Even then, column families tend to get misused.
>> If you’re looking at a relational model… Phoenix or Splice Machine would allow
>> you to do something… although Phoenix is still VERY primitive.
>> (Do they take advantage of cell versioning like Splice Machine yet?)
>> There are a couple of interesting things where you could create your 
>> own modeling tool / syntax (relationships)…
>> 1) HBase is more 3D than the 2D of an RDBMS, and similar to ORDBMSs.
>> 2) You can join entities on either a FK principle or on a weaker relationship type.

>> HBase stores CLOBs/BLOBs in each cell. It’s all just byte arrays with a finite bounded
>> length not to exceed the size of a region. So you could store an entire record as a CLOB
>> within a cell. It’s in this sense, that a cell can represent multiple attributes of your
>> object/record, that you gain an additional dimension, and it’s why you only need to use a
>> single data type.
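
To make the "entire record in one cell" idea concrete, here is a minimal sketch. It does not use any HBase client: nested Python dicts stand in for row key, column family, and qualifier, and JSON is just one assumed way to turn a record into bytes. The row key, the family `d`, the qualifier `rec`, and both helper functions are hypothetical.

```python
import json


def put_record(table, row_key, family, qualifier, record):
    """Serialize a whole record to bytes and store it in a single cell."""
    value = json.dumps(record, sort_keys=True).encode("utf-8")
    table.setdefault(row_key, {}).setdefault(family, {})[qualifier] = value


def get_record(table, row_key, family, qualifier):
    """Read the cell's bytes back and decode them into a record."""
    return json.loads(table[row_key][family][qualifier].decode("utf-8"))


# Toy table: row key -> column family -> qualifier -> bytes.
table = {}
contract = {"type": "swap", "notional": 1000000, "legs": ["fixed", "floating"]}
put_record(table, "deal#42", "d", "rec", contract)
restored = get_record(table, "deal#42", "d", "rec")
```

The store only ever sees bytes; the structure of the record (nested fields, lists) lives in the serialization the application chooses. The trade-off is that individual attributes inside the blob are not visible to the store as separate columns.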
>> HBase and Hadoop in general allow one to join orthogonal data sets that have a weak
>> relationship. So while you can still join sets against a FK, which implies a relationship,
>> you don’t have to do it.
>> Imagine if you wanted to find out the average cost of a front-end collision, by car,
>> for college-aged drivers, by major.
>> You would be joining insurance records against registrations for all of the universities
>> in the US for those students between the ages of 17 and 25.
>> How would you model this when in fact neither defining attribute is a FK? 
>> (This is why you need a good Secondary Indexing implementation and 
>> not something brain dead that wasn’t alcohol induced. ;-)
>> Does that make sense? 
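
One possible reading of the insurance example above, sketched in plain Python rather than HBase or MapReduce code. Everything here is an assumption made for illustration: the field names, the sample rows, and especially the choice of normalized name plus date of birth as the weak matching attribute (neither data set holds a key into the other). The sketch also groups only by major and leaves out the "by car" dimension.

```python
from collections import defaultdict

# Hypothetical insurance claims and university registrations.
claims = [
    {"driver": "Ann Lee", "dob": "1995-03-02", "age": 20, "type": "front_end", "cost": 4000.0},
    {"driver": "Bob Roy", "dob": "1994-07-19", "age": 21, "type": "front_end", "cost": 6000.0},
    {"driver": "Cy Dunn", "dob": "1996-01-11", "age": 19, "type": "rear_end", "cost": 1500.0},
    {"driver": "Di Fox", "dob": "1980-05-05", "age": 35, "type": "front_end", "cost": 9000.0},
]
registrations = [
    {"student": "ann lee", "dob": "1995-03-02", "major": "Physics"},
    {"student": "BOB ROY", "dob": "1994-07-19", "major": "Physics"},
    {"student": "cy dunn", "dob": "1996-01-11", "major": "History"},
]


def weak_key(name, dob):
    """Derived matching key; an assumption, not a real foreign key."""
    return (name.strip().lower(), dob)


def average_cost_by_major(claims, registrations, min_age=17, max_age=25):
    """Hash-join claims to registrations on the weak key, then average by major."""
    major_by_key = {weak_key(r["student"], r["dob"]): r["major"] for r in registrations}
    totals = defaultdict(lambda: [0.0, 0])  # major -> [sum of costs, count]
    for claim in claims:
        if claim["type"] != "front_end":
            continue
        if not (min_age <= claim["age"] <= max_age):
            continue
        major = major_by_key.get(weak_key(claim["driver"], claim["dob"]))
        if major is None:
            continue  # driver is not a registered student
        totals[major][0] += claim["cost"]
        totals[major][1] += 1
    return {major: total / count for major, (total, count) in totals.items()}


result = average_cost_by_major(claims, registrations)
```

With the sample rows, only the two Physics students have qualifying front-end claims, so `result` is `{"Physics": 5000.0}`. In a real deployment the lookup table would itself have to be built or indexed on the matching attribute, which is where the secondary indexing remark comes in.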
>> Note: I don’t know if anyone like CCCis, Allstate, State Farm, or Progressive Insurance
>> is doing anything like this. But they could.
>>> On Apr 5, 2015, at 7:54 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:
>>> Not sure you want to ever do that... Designing an HBase application 
>>> is far different from designing an RDBMS one. Not sure those tools fit well here.
>>> What's your goal? Designing your HBase schema somewhere and then 
>>> letting the tool generate your HBase tables?
>>> 2015-04-05 18:26 GMT-04:00 Ben Liang <liangpc@hotmail.com>:
>>>> Hi all,
>>>>      Do you have any tools to manage Data Architecture & Modeling 
>>>> for HBase (or Phoenix)? Can we use PowerDesigner or ERwin to do it?
>>>>      Please give me some advice.
>>>> Regards,
>>>> Ben Liang
>> The opinions expressed here are mine, while they may reflect a cognitive thought,
>> that is purely accidental.
>> Use at your own risk. 
>> Michael Segel
>> michael_segel (AT) hotmail.com

