kylin-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From 高扬02 <>
Subject 答复: 答复: cube关联后的distinct问题
Date Mon, 29 Oct 2018 10:06:28 GMT
Hi Xiaoxiang,
I tried to query pushdown to solve this case , but it is too slow to meet the need of us.
I will try something else to solve this problem.
Thank you for your advice.

Best regards,
Yang Gao

发件人: Xiaoxiang Yu []
发送时间: 2018年10月29日 17:39
主题: Re: 答复: cube关联后的distinct问题

Hello gaoyang,

In your case,  you have two fact tables which both has one column "user_id", these two columns
both have duplicate values so you cannot use one of these fact tables as a lookup table. You
want get the distinct count of user_id which occur in both two sides.

If what I understand is correct, I guess you should enable query pushdown to let other sql
engine to do such task.

Best wishes,
Xiaoxiang Yu

发件人: 高扬02 <<>>
答复: "<>" <<>>
日期: 2018年10月29日 星期一 16:50
收件人: "<>" <<>>
主题: 答复: cube关联后的distinct问题

Hi Xiaoxiang,
Thank you for your reply.
So you mean , we cannot use Kylin to deal with the intersection of two sets from two fact
tables , is that true ?
If that is true , I cannot use “join” between two fact tables in two cubes.
And I cannot create a snowflake schema with the tow fact tables , because they have multi-multi
records after “join” , and the Exception is ” Dup key found” .
Please give me a suggestion , how to get the intersection of two sets from  two fact tables.
Thanks a lot.
Best regards.
Yang Gao

发件人: Xiaoxiang Yu []
发送时间: 2018年10月29日 15:44
主题: Re: cube关联后的distinct问题
重要性: 高

Hi, kylin is a OLAP engine, it support star schema and snowflake schema. Both star schema
and snowflake schema support only one fact table. If these two tables cannnot be found in
one model, it cannot be executed on kylin.
As FAQ says []

Is Kylin a generic SQL engine for big data?

  *   No, Kylin is an OLAP engine with SQL interface. The SQL queries need be matched with
the pre-defined OLAP model.

Xiaoxiang Yu

发件人: 高扬02 <<>>
答复: "<>" <<>>
日期: 2018年10月25日 星期四 15:10
收件人: "<>" <<>>
主题: cube关联后的distinct问题



select count(distinct a.r_id) from
(select r.user_id as r_id,l.user_id as l_id from
(select user_id from register where data_day='2018-09-01' and app_id='403_20170811' )r
inner join
(select user_id from login where data_day='2018-09-01' and app_id='403_20170811')l
on l.user_id=r.user_id

Can't create EnumerableAggregate! while executing SQL: "select count(distinct a.r_bmid) from
(select r.bmid_user as r_bmid,l.bmid_user as l_bmid from (select bmid_user from wl_register_bm_mes
where data_day='2018-09-01' and app_id='403_20170811' )r inner join (select bmid_user from
wl_login_new_mes where data_day='2018-10-14' and app_id='403_20170811')l on l.bmid_user=r.bmid_user
)a LIMIT 50000"

View raw message