kylin-user mailing list archives

From Chao Long <chao.long0...@gmail.com>
Subject Re: Data skew when redistributing the intermediate table
Date Fri, 14 Jun 2019 02:52:16 GMT
Hi wang,
   "DISTRIBUTE BY RAND()" may cause data inconsistency, so we changed the
redistribute step to distribute by the first few columns of the rowkey; the
default is the first 3 columns. See the following issues for more details.
If you don't have a data skew problem, you can simply disable the
"redistribute" step by setting "kylin.source.hive.redistribute-flat-table"
to false.
https://issues.apache.org/jira/browse/KYLIN-3388
https://issues.apache.org/jira/browse/KYLIN-3457
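
As a rough illustration of the setting mentioned above, the fragment below
shows how disabling the step might look. The property name is taken from
this message; placing it in kylin.properties (rather than, say, a
cube-level override) is an assumption.

```properties
# Skip the "redistribute" step for the intermediate flat table.
# Assumed location: conf/kylin.properties
kylin.source.hive.redistribute-flat-table=false
```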

On Thu, Jun 13, 2019 at 5:03 PM ning.wang@ymm56.com <ning.wang@ymm56.com>
wrote:

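> According to the documentation, the intermediate table is redistributed
> randomly using DISTRIBUTE BY RAND(). My cube does not specify a shard-by
> column, yet the data is not distributed randomly; instead, the first 3
> dimension columns are used. Because the cube has no high-cardinality
> dimensions, this causes data skew.
> How can I configure it to distribute randomly, or do you have any other
> suggestions?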
>
> ------------------------------
> ning.wang@ymm56.com
>