kylin-issues mailing list archives

From "Shaofeng SHI (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KYLIN-3696) TopN measure returns inaccurate values, far from the true values, when two cubes on the same model both enable it
Date Tue, 27 Nov 2018 09:21:00 GMT

    [ https://issues.apache.org/jira/browse/KYLIN-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16700106#comment-16700106 ]

Shaofeng SHI commented on KYLIN-3696:
-------------------------------------

[~yangwei] If a cube has only a few measures and they are all in the same HBase column family,
Kylin stores the measure bytes directly in the HBase KeyValue, which avoids one round of
deserialization and serialization. The bug is in TopN's serialization, so I think this is why
your first cube doesn't have this problem.
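
To illustrate the point, here is a minimal sketch in Python. It is not Kylin's code: the wire format, the bug (dropping the last counter on write), and every function name are hypothetical and chosen only to show why a serialization defect stays hidden on a path that copies bytes through unchanged and only surfaces on a path that decodes and re-encodes them.

```python
import struct

# Hypothetical wire format for illustration only (NOT Kylin's TopN format):
# a 4-byte big-endian count, then one (int32 key, float64 count) pair per counter.

def encode(counters):
    """Correct encoder: count prefix followed by the (key, count) pairs."""
    out = struct.pack(">i", len(counters))
    for key, count in counters:
        out += struct.pack(">id", key, count)
    return out

def decode(data):
    """Decoder matching encode(); each pair occupies 12 bytes after the prefix."""
    (n,) = struct.unpack_from(">i", data, 0)
    return [struct.unpack_from(">id", data, 4 + 12 * i) for i in range(n)]

def buggy_encode(counters):
    """Hypothetical serialization bug: the last counter is silently dropped."""
    return encode(counters[:-1])

def store_direct(raw):
    """Pass-through path: the bytes are stored unchanged, so the bug never runs."""
    return raw

def store_via_round_trip(raw):
    """Round-trip path: decode, then re-encode with the buggy serializer."""
    return buggy_encode(decode(raw))

original = [(1, 10.0), (2, 5.0)]
raw = encode(original)
print(decode(store_direct(raw)) == original)          # pass-through keeps the data
print(decode(store_via_round_trip(raw)) == original)  # round trip corrupts it
```

In this sketch only the second path changes the stored counters, which mirrors the explanation above: a cube whose measures are written straight through would not show the problem.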

> TopN measure returns inaccurate values, far from the true values, when two cubes on the same model both enable it
> ------------------------------------
>
>                 Key: KYLIN-3696
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3696
>             Project: Kylin
>          Issue Type: Bug
>          Components: Measure - TopN
>    Affects Versions: v2.5.1
>            Reporter: yangwei
>            Priority: Major
>         Attachments: image-2018-11-20-10-57-28-546.png, image-2018-11-20-11-01-25-120.png,
image-2018-11-20-11-27-43-750.png
>
>
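> I am using v2.5.1, and the TopN measure returns inaccurate totals.
> Steps to reproduce:
> 1. Two cubes use the same model, i.e. the same physical fact table.
> 2. Both cubes contain the same TopN measure.
> 3. Both cubes are in Ready status.
> My temporary workaround is to remove a TopN measure from one of the cubes.
> The same SQL returns results in Hive and in Kylin that do not match and differ widely. The SQL is given below: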
> SELECT IP ,
>  SUM(ACCESS_COUNT) c
> FROM API_ACCESS
> WHERE TAG_DATE = CAST('2018-11-19' AS DATE)
>  group by ip
> ORDER BY 
>  c DESC
> LIMIT 10;
> Measures in the two cubes:
>  cube1:
> !image-2018-11-20-10-57-28-546.png!
> cube2:
> !image-2018-11-20-11-01-25-120.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
