kylin-user mailing list archives

From ShaoFeng Shi <shaofeng...@apache.org>
Subject [KYLIN-3388] Hive data may become inconsistent after redistribution
Date Tue, 12 Jun 2018 06:11:36 GMT
Hello Kylin users,

Recently Yanghong Zhong from the eBay team reported that the source data may
become inconsistent after the "Redistribute flat hive table" step. This is
caused by a bug in Hive's handling of the "distribute by rand()" statement,
which Kylin relies on to distribute the data more evenly. For more
information, please check:

https://issues.apache.org/jira/browse/KYLIN-3388

Until a hot-fix is released, we recommend that you disable the redistribution
feature to ensure data accuracy, by setting:

kylin.source.hive.redistribute-flat-table=false


in conf/kylin.properties. Kylin must be restarted for the change to take effect.
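
For anyone who wants to script this change across several nodes, here is a
minimal sketch in Python. It is not part of Kylin; the helper names
set_property and update_file are hypothetical, and the parsing assumes a
simple key=value properties file with # comments. Please back up the file
before running anything like this.

```python
KEY = "kylin.source.hive.redistribute-flat-table"


def set_property(text, key, value):
    """Return properties text with `key` set to `value`.

    An existing uncommented assignment is replaced in place (later
    duplicates are dropped); otherwise the assignment is appended.
    Commented-out lines are left untouched.
    """
    out = []
    found = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            name = stripped.split("=", 1)[0].strip()
            if name == key:
                if not found:
                    out.append(f"{key}={value}")
                    found = True
                continue  # skip the old assignment or a duplicate
        out.append(line)
    if not found:
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"


def update_file(path, key=KEY, value="false"):
    """Rewrite the properties file at `path` (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        original = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(set_property(original, key, value))
```

Calling update_file("conf/kylin.properties") from the Kylin installation
directory would apply the setting above; the restart is still required
afterwards.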

Thanks for your attention.

-- 
Best regards,

Shaofeng Shi 史少锋
