kylin-issues mailing list archives

From "Zhong Yanghong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KYLIN-3342) Cubing level calculation inconsistent?
Date Wed, 18 Apr 2018 07:57:00 GMT

    [ https://issues.apache.org/jira/browse/KYLIN-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16442078#comment-16442078
] 

Zhong Yanghong commented on KYLIN-3342:
---------------------------------------

For Spark cubing, the statistics for the segment have already been calculated before
{{totalLevels}} is obtained. Therefore, it's OK to get it via {{CuboidScheduler.getBuildLevel()}}.
However, for the first build with layered cubing, we don't know the statistics when creating the
{{CubingJob}}. Therefore, it's better for us to invoke {{CuboidUtil.getLongestDepth(...)}} to
estimate a minimum {{totalLevels}} and so reduce the number of layers for layered cubing. After the
statistics are calculated, the total depth may also change for {{TreeCuboidScheduler}}. Then, in
{{CuboidJob}}, there's a check for whether to skip.
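
To illustrate the skip check mentioned above, here is a minimal sketch. It is not Kylin code: the class {{LayerSkipSketch}}, its methods, and the numbers are hypothetical, and it assumes that layers are numbered from 0 (base cuboid) up to the planned {{totalLevels}}, and that a layer is skipped when its level exceeds the build level known after statistics.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Kylin.
class LayerSkipSketch {

    // Assumption: a layer is skipped when its level is deeper than the
    // build level that is known once statistics have been calculated.
    public static boolean shouldSkip(int level, int actualBuildLevel) {
        return level > actualBuildLevel;
    }

    // Returns the planned layers (0..plannedTotalLevels) that would still run.
    public static List<Integer> layersToRun(int plannedTotalLevels, int actualBuildLevel) {
        List<Integer> result = new ArrayList<>();
        for (int level = 0; level <= plannedTotalLevels; level++) {
            if (!shouldSkip(level, actualBuildLevel)) {
                result.add(level);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int plannedTotalLevels = 7; // made-up estimate taken at job creation
        int actualBuildLevel = 6;   // made-up depth after statistics are known
        System.out.println(layersToRun(plannedTotalLevels, actualBuildLevel));
    }
}
```

With these made-up numbers, the layer at level 7 is planned but skipped. Code that indexes per-layer statistics by a planned level, without such a check, could read past the end of the list.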

> Cubing level calculation inconsistent?
> --------------------------------------
>
>                 Key: KYLIN-3342
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3342
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: liyang
>            Priority: Major
>         Attachments: KYLIN-3342.patch
>
>
> Got the exception below during cube build.
> {{ Caused by: java.lang.IndexOutOfBoundsException: Index: 7, Size: 7}}
> {{ at java.util.ArrayList.rangeCheck(ArrayList.java:635)}}
> {{ at java.util.ArrayList.get(ArrayList.java:411)}}
> {{ at org.apache.kylin.engine.mr.common.CubeStatsReader.estimateLayerSize(SourceFile:280)}}
> {{ at org.apache.kylin.engine.spark.SparkCubingByLayer.estimateRDDPartitionNum(SourceFile:219)}}
> {{ at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SourceFile:199)}}
> {{ at org.apache.kylin.common.util.AbstractApplication.execute(SourceFile:37)}}
> {{ ... 6 more}}
>  
> Found two ways of calculating the level of cuboids:
>  * via CuboidScheduler.getBuildLevel()
>  * via CuboidUtil.getLongestDepth(...)
> We should settle on one approach.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
