kylin-issues mailing list archives

From "Yifei Wu (JIRA)" <>
Subject [jira] [Commented] (KYLIN-3078) the estimated size of percentile measure is too big
Date Sat, 02 Dec 2017 15:59:00 GMT


Yifei Wu commented on KYLIN-3078:

The key is to clarify how the percentile measure affects the cube size estimate and to find
a more accurate way to estimate the size of a percentile measure.
The measure is implemented with the t-digest algorithm, so some regular patterns can be
derived from the analysis in the t-digest paper and from the statistics collected in the local

> the estimated size of percentile measure is too big
> ---------------------------------------------------
>                 Key: KYLIN-3078
>                 URL:
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: Yifei Wu
>            Assignee: Yifei Wu
>            Priority: Critical
> To choose a shard number that keeps the size per shard under control, we need to roughly
estimate the cube size before building a cube by summing the sizes of all dimensions and
measures. However, the current size estimate for the percentile measure is inaccurate, which
causes too many partitions for cube storage. This may in turn degrade SQL query performance.
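
To illustrate why an oversized per-measure estimate leads to too many shards, here is a minimal sketch of the arithmetic described above. It is not Kylin's actual code: the function names, the byte widths, the row count, and the target shard size are all hypothetical values chosen for illustration.

```python
import math


def estimate_cube_bytes(row_count, dimension_bytes, measure_bytes):
    """Rough cube size: row count times the summed per-row width of all columns."""
    row_bytes = sum(dimension_bytes) + sum(measure_bytes)
    return row_count * row_bytes


def shard_count(cube_bytes, target_shard_bytes, max_shards):
    """Shards needed to keep each shard near the target size, clamped to [1, max_shards]."""
    shards = math.ceil(cube_bytes / target_shard_bytes)
    return max(1, min(shards, max_shards))


# All numbers below are hypothetical, not taken from Kylin.
ROWS = 1_000_000
DIMS = [4, 4, 8]                 # per-row bytes of each dimension
OTHER_MEASURES = [8, 8]          # per-row bytes of the non-percentile measures
TARGET = 100 * 1024 * 1024       # target size per shard (100 MiB)
MAX_SHARDS = 1000

# Same cube, two different assumed per-row sizes for the percentile measure.
inflated = estimate_cube_bytes(ROWS, DIMS, OTHER_MEASURES + [16000])
tighter = estimate_cube_bytes(ROWS, DIMS, OTHER_MEASURES + [1600])

print(shard_count(inflated, TARGET, MAX_SHARDS))
print(shard_count(tighter, TARGET, MAX_SHARDS))
```

Because the percentile measure dominates the per-row width in this sketch, a tenfold overestimate of its size inflates the shard count by roughly the same factor.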

This message was sent by Atlassian JIRA
