kylin-user mailing list archives

From ShaoFeng Shi <shaofeng...@apache.org>
Subject Re: Extract Fact Table Distinct Columns Step
Date Wed, 20 Dec 2017 06:13:55 GMT
Hi Sonny,

Have you checked this document? It describes each build step:
https://kylin.apache.org/docs21/howto/howto_optimize_build.html

Also, which Kylin version are you using? Did you check the MR job progress
to see which stage is more expensive, map or reduce, and how many mappers
and reducers there are? Do all mappers/reducers take a similar amount of
time, or did some specific ones take much longer than the others?
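
To make the "similar time vs. some much longer" check concrete, here is a
minimal sketch in Python. It assumes you have copied the per-task durations
out of the MR job UI by hand; the task IDs, the durations, and the 3x
threshold are all made up for illustration and are not Kylin or Hadoop
defaults.

```python
from statistics import median


def find_stragglers(durations_sec, factor=3.0):
    """Return (task_id, seconds) pairs that ran longer than factor x the median.

    durations_sec maps a task ID to its run time in seconds. The factor is an
    arbitrary illustrative threshold, not a Kylin or Hadoop setting.
    """
    typical = median(durations_sec.values())
    slow = [
        (task, secs)
        for task, secs in durations_sec.items()
        if secs > factor * typical
    ]
    # Slowest task first, so the worst offender is easy to spot.
    return sorted(slow, key=lambda pair: pair[1], reverse=True)


# Hypothetical durations, not taken from any real job.
mapper_durations = {
    "m_000000": 95,
    "m_000001": 102,
    "m_000002": 98,
    "m_000003": 940,
}
print(find_stragglers(mapper_durations))
```

If a few tasks stand out like this, the slowness is probably skew in those
tasks' input rather than a lack of slots; if every task is slow, the cube
design is the more likely place to look.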

Furthermore, for a deeper dive, please provide the cube definition. We need
to know the number of dimensions, the aggregation groups, and the encoding
methods, as well as other possible factors.

2017-12-20 13:00 GMT+08:00 Sonny Heer <sonnyheer@gmail.com>:

> Can someone explain what step 3 does?
>
> Specifically, how does it relate to dimensions, measures, and row keys?
> Our input fact table is about 234 million records and this step is taking
> forever.
>
> We have 450 GB of memory with 25 slots per node, which is about 225
> concurrently running slots, and it's still taking a while.
>
> The doc just points to the optimize cube page, but that page talks
> about hierarchy columns and derived columns. We don't have any lookup
> tables, so there are no derived columns, and there is no natural hierarchy.
>
> I'm just trying to find out what determines whether this step takes a
> longer or shorter time.
>
> Thanks
>



-- 
Best regards,

Shaofeng Shi 史少锋
