kylin-user mailing list archives

From Sonny Heer <sonnyh...@gmail.com>
Subject Re: Hive table design (multiple fact tables or rolled up)
Date Thu, 09 Mar 2017 16:57:35 GMT
>
> Thanks, ShaoFeng.  That makes sense.  Can you talk about having two cubes
> vs. one, and what happens if queries span cubes (e.g. on columns from both cubes)?


On Wed, Mar 8, 2017 at 9:22 PM, ShaoFeng Shi <shaofengshi@apache.org> wrote:

> Kylin 1.x allows only one fact table, so you have to join multiple
> big tables into one before using them in a Cube. Of course you can do that with a
> Hive view, so all joins happen on the fly when Kylin fetches data from Hive. The
> impact is that when querying Kylin you need to use the name of the joined table
> instead of the original fact table name.
>
> From Kylin 2.x, multiple fact tables are supported in one Cube; you don't
> need to create an additional view or flat table, just use the original names.
>
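As a rough illustration of the Hive-view approach described above, here is a minimal sketch. It is not Kylin or Hive code: SQLite (via Python's standard library) stands in for Hive only so the example runs anywhere, and the table, column, and view names (sales, returns, product, sales_fact_v) are hypothetical. It shows two large tables joined in a view, with the query then naming the view instead of the original tables.

```python
import sqlite3

# SQLite stands in for Hive here purely so the sketch is runnable.
# All table, column, and view names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales   (order_id INTEGER, product_id INTEGER, amount REAL);
CREATE TABLE returns (order_id INTEGER, refund REAL);
CREATE TABLE product (product_id INTEGER, category TEXT);

INSERT INTO sales   VALUES (1, 10, 100.0), (2, 11, 50.0), (3, 10, 25.0);
INSERT INTO returns VALUES (2, 50.0);
INSERT INTO product VALUES (10, 'books'), (11, 'toys');

-- The view joins the two large tables so they can be used as one fact table.
-- The join is evaluated whenever the view is read, not materialized up front.
CREATE VIEW sales_fact_v AS
SELECT s.order_id, s.product_id, s.amount,
       COALESCE(r.refund, 0.0) AS refund
FROM sales s
LEFT JOIN returns r ON r.order_id = s.order_id;
""")

# Queries name the view (sales_fact_v), not the original sales/returns tables.
rows = conn.execute("""
SELECT p.category, SUM(f.amount) AS total_amount, SUM(f.refund) AS total_refund
FROM sales_fact_v f
JOIN product p ON p.product_id = f.product_id
GROUP BY p.category
ORDER BY p.category
""").fetchall()

print(rows)
```

In Hive the same idea would be a CREATE VIEW over the joined tables, with that view then picked as the single fact table in the Kylin model.
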
> 2017-03-09 9:45 GMT+08:00 Sonny Heer <sonnyheer@gmail.com>:
>
>> To clarify, in my use case the data can be organized to have either a
>> couple of fact tables or a single large one.  Queries are open-ended at this
>> point.  Queries may or may not cross facts.
>>
>> On Wed, Mar 8, 2017 at 5:13 PM, Sonny Heer <sonnyheer@gmail.com> wrote:
>>
>>> Let me put it another way.  Assume a SALES table and a PRODUCT table.
>>> This is not highly normalized in the grand scheme of things, but somewhat.
>>> The question is what benefit there is to denormalizing this further into a
>>> single table for Kylin.  I read something about hierarchical dimensions.
>>> So from Kylin's perspective, which is better: having one-to-many in a single
>>> table, or some normalized form?
>>>
>>> On Wed, Mar 8, 2017 at 4:24 PM, Billy Liu <billyliu@apache.org> wrote:
>>>
>>>> Please check star schema first:
>>>> https://en.wikipedia.org/wiki/Star_schema
>>>>
>>>> 2017-03-08 12:48 GMT-08:00 Sonny Heer <sonnyheer@gmail.com>:
>>>>
>>>>> Hi, I'm somewhat new to Kylin.  We have a relational db schema imported
>>>>> into Hive as is at the moment.  The schema is highly normalized with lots
>>>>> of tables.  I can see this database having multiple fact tables or a
>>>>> handful of fact tables.
>>>>>
>>>>> In Kylin I see when creating a model (star) you have the option to
>>>>> pick a single fact table...meaning there is a single cube per fact table.
>>>>> Please provide pros/cons on running transformations to denormalize the
>>>>> tables into a single table vs keeping lots of tables with many fact/lookup
>>>>> tables.
>>>>>
>>>>> In short:
>>>>> Should we do any transformations in Hive before presenting the tables
>>>>> to Kylin for cubing?
>>>>>
>>>>> Thanks
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>>
>>>
>>> Pushpinder S. Heer
>>> Senior Software Engineer
>>> m: 360-434-4354 h: 509-884-2574
>>>
>>
>>
>>
>> --
>>
>>
>> Pushpinder S. Heer
>> Senior Software Engineer
>> m: 360-434-4354 h: 509-884-2574
>>
>
>
>
> --
> Best regards,
>
> Shaofeng Shi 史少锋
>
>


-- 


Pushpinder S. Heer
Senior Software Engineer
m: 360-434-4354 h: 509-884-2574
