kylin-user mailing list archives

From 赵天烁 <zhaotians...@meizu.com>
Subject why distribute by partition column while creating flat hive table?
Date Tue, 23 Aug 2016 10:34:39 GMT
I have a table whose data grows by billions of rows every day. When I build a cube on
that table, the build gets stuck forever at the "create flat Hive table" step.
I checked the MapReduce job and found that the SQL for this step ends with "DISTRIBUTE
BY ${partition date column}".
I ran the same SQL manually with the "DISTRIBUTE BY" clause removed, and it finished
fine within 10 minutes.
As far as I know, creating a flat table is helpful when there is a star schema, but
all I have is the fact table. So why bother creating a table with the same structure
and the same data? The only difference is the table name.
So I wonder: when no lookup table is defined, is it possible to just create a view with
the intermediate table name that Kylin needs? That would eliminate this long-running
task, which seems to achieve nothing.
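
To make the comparison concrete, here is a rough sketch of the three statement shapes discussed above: the flat-table insert with DISTRIBUTE BY, the same insert without it, and the proposed view. Everything in it is an assumption for illustration: the table and column names (`kylin_intermediate_example`, `fact_events`, `dt`) are hypothetical, and the strings only approximate what Kylin generates. For background, Hive sends all rows with the same DISTRIBUTE BY key to the same reducer, so a build that covers a single date may end up funnelling most of the data through one reducer; that is only a guess at why the step is slow here.

```python
# Hypothetical sketch: builds HiveQL strings only, nothing is executed.
# All table and column names are placeholders, not Kylin's real names.

def flat_table_insert(intermediate, fact, columns, distribute_by=None):
    """Return an INSERT OVERWRITE ... SELECT statement for the flat table.

    If distribute_by is given, a DISTRIBUTE BY clause is appended, which
    mirrors the shape of the SQL seen in the stuck build step.
    """
    sql = "INSERT OVERWRITE TABLE {} SELECT {} FROM {}".format(
        intermediate, ", ".join(columns), fact
    )
    if distribute_by:
        sql += " DISTRIBUTE BY {}".format(distribute_by)
    return sql


def flat_table_view(intermediate, fact, columns):
    """Return a CREATE VIEW statement that exposes the fact table under
    the intermediate name, without copying any data."""
    return "CREATE VIEW {} AS SELECT {} FROM {}".format(
        intermediate, ", ".join(columns), fact
    )


if __name__ == "__main__":
    cols = ["dt", "user_id", "amount"]
    print(flat_table_insert("kylin_intermediate_example", "fact_events", cols, "dt"))
    print(flat_table_insert("kylin_intermediate_example", "fact_events", cols))
    print(flat_table_view("kylin_intermediate_example", "fact_events", cols))
```

The second statement is what I ran by hand; the third is the view I am asking about. Whether the later build steps can read from a view instead of a table is something I have not verified.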

________________________________
赵天烁
Kevin Zhao
zhaotianshuo@meizu.com

珠海市魅族科技有限公司
MEIZU Technology Co., Ltd.
广东省珠海市科技创新海岸魅族科技楼
MEIZU Tech Bldg., Technology & Innovation Coast
Zhuhai, 519085, Guangdong, China
meizu.com