kylin-issues mailing list archives

From "Yifei Wu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KYLIN-3078) the estimated size of percentile measure is too big
Date Sat, 02 Dec 2017 15:59:00 GMT

    [ https://issues.apache.org/jira/browse/KYLIN-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275628#comment-16275628 ]

Yifei Wu commented on KYLIN-3078:
---------------------------------

The key is to clarify how the percentile measure affects the cube size estimate and to find
a more accurate way to estimate the size of the percentile measure.
Because the measure is implemented with the t-digest algorithm, we should be able to derive
some regular patterns from the analysis in the t-digest paper and from the statistics
collected in local tests.



> the estimated size of percentile measure is too big
> ---------------------------------------------------
>
>                 Key: KYLIN-3078
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3078
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: Yifei Wu
>            Assignee: Yifei Wu
>            Priority: Critical
>
> To choose a shard number that properly controls the size per shard, we need to roughly
> estimate the cube size by accumulating the sizes of all dimensions and measures before
> building a cube. However, the current size calculation for the percentile measure is
> inaccurate and causes too many partitions for cube storage. This may in turn affect the
> performance of SQL queries.
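
The estimate described in the issue can be illustrated with a minimal sketch. This is not
Kylin's actual code: the function name, the per-column byte sizes, the row count, and the
target shard size are all hypothetical values chosen for illustration. It only shows how an
overestimated per-row size for one measure inflates the resulting shard count.

```python
import math


def estimate_shard_count(row_count, dimension_bytes, measure_bytes, target_shard_bytes):
    """Roughly estimate cube size and derive a shard count from it.

    Hypothetical helper: sums the estimated per-row bytes of every dimension
    and measure, multiplies by the row count, and divides by the desired
    size per shard.
    """
    bytes_per_row = sum(dimension_bytes) + sum(measure_bytes)
    estimated_cube_bytes = row_count * bytes_per_row
    # Always use at least one shard, and round up so no shard exceeds the target.
    return max(1, math.ceil(estimated_cube_bytes / target_shard_bytes))


# Hypothetical per-row sizes in bytes for three dimensions.
dims = [4, 4, 8]

# Same cube, two different size estimates for the percentile measure (assumed values).
accurate = estimate_shard_count(1_000_000, dims, [8, 1_000], 1_000_000_000)
inflated = estimate_shard_count(1_000_000, dims, [8, 100_000], 1_000_000_000)

print(accurate, inflated)
```

With these assumed numbers, the inflated percentile estimate yields far more shards than the
more accurate one, even though the cube itself is unchanged.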



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
