kylin-user mailing list archives

From ShaoFeng Shi <shaofeng...@apache.org>
Subject Re: Kylin pre-computation is very slow
Date Fri, 26 Apr 2019 10:04:49 GMT
As far as I know, the Spark engine has a performance problem in the "Fact
distinct" step if there is an ultra-high-cardinality dimension. The fix will
be released in v2.6.2. Please switch to the MR engine if you encounter this
problem for now.
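
For reference, a minimal sketch of switching the build engine back to
MapReduce, assuming the Kylin 2.x property names from the shipped
kylin.properties template (existing cubes also expose a "Cube Engine" choice
in the cube designer, so this is illustrative rather than a definitive recipe):

    # kylin.properties: default build engine for newly created cubes
    # 2 = MapReduce, 4 = Spark
    kylin.engine.default=2

A restart of the Kylin instance is needed for the new default to take effect;
cubes that were already designed keep the engine chosen at design time unless
it is changed in the cube designer.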
Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Email: shaofengshi@apache.org

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscribe@kylin.apache.org
Join Kylin dev mail group: dev-subscribe@kylin.apache.org




zjt <zhao_jintao@163.com> wrote on Fri, Apr 26, 2019 at 3:26 PM:

> Is there any deduplication (count distinct)? Any high-cardinality dimensions?
>
>
> Zhao Jintao
> Email: shqmh@126.com
>
>
> On Apr 26, 2019 at 15:05, gaofeng5096@capinfo.com.cn wrote:
>
> We have already allocated very generous resources to our cluster, but it is
> still like this. Is the dimension design unreasonable?
> ------------------------------
> gaofeng5096@capinfo.com.cn
>
>
> *From:* JiaTao Tao <taojiatao@gmail.com>
> *Sent:* 2019-04-25 09:47
> *To:* user <user@kylin.apache.org>
> *Subject:* Re: Kylin pre-computation is very slow
> Hi
>
> As Jintao said, check the Spark jobs in the Spark UI (tasks, input/output,
> shuffle, etc.)
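
A quick way to locate those Spark jobs when the build runs on YARN (as
suggested later in this thread); these are standard YARN CLI commands, and
<application_id> is a placeholder:

    # list running applications; the cube-build Spark job's tracking URL
    # points at its Spark UI
    yarn application -list -appStates RUNNING

    # after the job finishes, fetch its aggregated logs by application id
    yarn logs -applicationId <application_id>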
>
>
> --
>
>
> Regards!
>
> Aron Tao
>
>
> zjt <zhao_jintao@163.com> wrote on Wed, Apr 24, 2019 at 11:18 PM:
>
>> This step runs a Spark job. Check the Spark job's execution status on YARN.
>> If resources are not enough, add parameters and tune the resources.
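
The "parameters" mentioned above are the Spark resource settings that Kylin
passes through to the build job. A minimal sketch, assuming the
kylin.engine.spark-conf.* pass-through properties of Kylin 2.x (the same keys
can also be overridden per cube via "Configuration Overwrites"); the values
are illustrative only, not recommendations:

    kylin.engine.spark-conf.spark.executor.instances=10
    kylin.engine.spark-conf.spark.executor.memory=4G
    kylin.engine.spark-conf.spark.executor.cores=2
    kylin.engine.spark-conf.spark.yarn.executor.memoryOverhead=1024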
>>
>>
>>
>> Zhao Jintao
>> Email: shqmh@126.com
>>
>>
>> On Apr 24, 2019 at 11:35, gaofeng5096@capinfo.com.cn wrote:
>> During Kylin's pre-computation we run the build with Spark as the compute
>> engine. There are only 3 dimensions and a few million rows of data, yet the
>> computation takes a very long time. What could be the reason?
>>
>> ------------------------------
>> gaofeng5096@capinfo.com.cn
>>
>>
>
>
>