kylin-issues mailing list archives

From "mathias kluba (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KYLIN-3138) cuboids on-demand build
Date Fri, 02 Feb 2018 19:59:00 GMT

    [ https://issues.apache.org/jira/browse/KYLIN-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350880#comment-16350880 ]

mathias kluba commented on KYLIN-3138:
--------------------------------------

I also have this issue with cubes that have a large number of dimensions (e.g. 300).

I know we can optimize the cube (http://kylin.apache.org/docs21/howto/howto_optimize_cubes.html),
but that requires thinking up front.

It would be nice to build only the first layer and aggregate "on the fly" at query time
when a cuboid is missing.
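
To make the idea concrete, here is a minimal sketch in Python. It is not Kylin code: the names (BASE_CUBOID, rollup, LazyCube) and the sample data are hypothetical, and it assumes the "first layer" means the base cuboid holding every dimension, with a single SUM measure. A missing cuboid is aggregated from the base cuboid at query time and then kept, so later queries on the same dimensions are served from the stored result.

```python
from collections import defaultdict

# Hypothetical base cuboid: one pre-aggregated row per combination of ALL dimensions.
DIMENSIONS = ("country", "device", "year")
BASE_CUBOID = {
    ("US", "mobile", 2017): 10,
    ("US", "desktop", 2017): 5,
    ("FR", "mobile", 2017): 7,
    ("FR", "mobile", 2018): 3,
}


def rollup(base, dimensions, wanted):
    """Aggregate the base cuboid down to the `wanted` dimensions (SUM measure)."""
    positions = [dimensions.index(d) for d in wanted]
    result = defaultdict(int)
    for key, measure in base.items():
        result[tuple(key[i] for i in positions)] += measure
    return dict(result)


class LazyCube:
    """Materializes a cuboid the first time a query asks for it (illustrative only)."""

    def __init__(self, base, dimensions):
        self.base = base
        self.dimensions = dimensions
        self.cuboids = {}  # cuboids built so far, keyed by sorted dimension names

    def query(self, wanted):
        key = tuple(sorted(wanted))
        if key not in self.cuboids:
            # Missing cuboid: aggregate on the fly from the base cuboid, then keep it.
            self.cuboids[key] = rollup(self.base, self.dimensions, key)
        return self.cuboids[key]


cube = LazyCube(BASE_CUBOID, DIMENSIONS)
by_country = cube.query(["country"])          # built on first request
by_country_again = cube.query(["country"])    # served from the stored cuboid
```

This only works directly for measures that can be re-aggregated from partial results, such as SUM or COUNT; something like an exact distinct count would need more than a plain sum over the base rows.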

 

> cuboids on-demand build
> -----------------------
>
>                 Key: KYLIN-3138
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3138
>             Project: Kylin
>          Issue Type: New Feature
>          Components: Job Engine, Query Engine, Spark Engine
>    Affects Versions: v2.2.0, v2.3.0
>            Reporter: Ruslan Dautkhanov
>            Assignee: Shaofeng SHI
>            Priority: Critical
>
> We just started using Kylin and quite like it so far.
> However, some of the datasets we have are too wide to even consider for OLAP cubing,
> unless those cuboids are built on-demand.
> I know some commercial non-open source products do this successfully. 
> The idea is to build a cuboid only when a user actually needs it.
> So for example, our BI dashboards do a certain rollup, so a SQL
> query hits the Kylin backend. Kylin realizes it hasn't built that particular cuboid just yet,
> so it immediately starts building it. Users have to wait a bit longer the first time
> they request that combination of dimensions. But all other requests, or requests
> from other users, will be fast from that point on.
> Kylin (or any other OLAP solution) wouldn't be feasible to use on very wide datasets
> unless this on-demand functionality is implemented. For example, some datasets we have
> have 100-200 dimensions, and we don't know up front which rollups users would want to do.
> I suggest adding a new dimension build rule, "lazy / on-demand". All previous rules
> apply. This new rule type would mean that a cuboid for a particular set of dimensions wouldn't
> be built up front if it's marked as "lazy / on-demand".
> Thoughts / ideas?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
