flink-user-zh mailing list archives

From Benchao Li <libenc...@gmail.com>
Subject Re: Does Flink streaming SQL support two-level GROUP BY aggregation?
Date Sat, 18 Apr 2020 02:08:09 GMT
Hi,

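Yes, this is supported.
What you are seeing happens because GROUP BY produces retract results: it first sends -[old], then +[new].
With two levels of aggregation, this becomes:
Level 1 -[old], level 2 -[cur], +[old]
Level 1 +[new], level 2 -[old], +[new]
So the sink briefly sees a smaller intermediate value before the final one, which matches your log (130, then 86, then 131).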
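
To make the retraction chain concrete, here is a minimal Python sketch that simulates it. This is not Flink code, only a toy model of the behaviour described above. The group names (pv_a, pv_b) and the starting UVs (86 and 44) are assumptions, chosen so that the sums line up with the values in the log. It also simplifies things by always emitting a new sum after a retraction, even in cases where a real engine might drop an empty group.

```python
def inner_update(inner_state, key, new_uv):
    """First-level aggregate (GROUP BY dt, pvareaid): emit -[old] then +[new]."""
    messages = []
    if key in inner_state:
        messages.append((False, key, inner_state[key]))  # retract old value
    inner_state[key] = new_uv
    messages.append((True, key, new_uv))                 # accumulate new value
    return messages


def outer_apply(outer_sums, message):
    """Second-level aggregate (GROUP BY dt): react to ONE first-level message."""
    is_accumulate, (dt, _pvareaid), uv = message
    emitted = []
    if dt in outer_sums:
        emitted.append((False, dt, outer_sums[dt]))      # retract current sum
    outer_sums[dt] = outer_sums.get(dt, 0) + (uv if is_accumulate else -uv)
    emitted.append((True, dt, outer_sums[dt]))           # emit updated sum
    return emitted


def run_demo():
    # Hypothetical starting state: two pvareaid groups whose UVs sum to 130.
    inner_state = {("20200417", "pv_a"): 86, ("20200417", "pv_b"): 44}
    outer_sums = {"20200417": 130}
    sink = []
    # One new distinct cuid arrives for pv_b, so its UV goes from 44 to 45.
    for message in inner_update(inner_state, ("20200417", "pv_b"), 45):
        sink.extend(outer_apply(outer_sums, message))
    return sink


if __name__ == "__main__":
    for flag, dt, uv in run_demo():
        print(flag, uv, dt)
```

Under these assumptions, a single input change produces four messages at the sink: retract 130, add 86, retract 86, add 131. The 86 is only a transient value. A sink that applies the messages in order ends up holding 131.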

On Sat, Apr 18, 2020 at 2:11 AM, dixingxing85@163.com <dixingxing85@163.com> wrote:

>
> Hi all:
>
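> One of our streaming SQL queries produces incorrect results: the values received by the sink fluctuate up and down. *We would like to confirm whether this is a bug,
> or whether Flink does not yet support this kind of SQL.*
> The scenario: first GROUP BY two dimensions, A and B, to compute UV, then GROUP BY A to sum the UVs across dimension B.
> The corresponding SQL is as follows (A -> dt, B -> pvareaid):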
>
> SELECT dt, SUM(a.uv) AS uv
> FROM (
>    SELECT dt, pvareaid, COUNT(DISTINCT cuid) AS uv
>    FROM streaming_log_event
>    WHERE action IN ('action1')
>       AND pvareaid NOT IN ('pv1', 'pv2')
>       AND pvareaid IS NOT NULL
>    GROUP BY dt, pvareaid
> ) a
> GROUP BY dt;
>
> The corresponding log of the data received by the sink:
>
> 2020-04-17 22:28:38,727    INFO groupBy xx -> to: Tuple2 -> Sink: Unnamed (1/1) (GeneralRedisSinkFunction.invoke:169) - receive data(false,0,86,20200417)
> 2020-04-17 22:28:38,727    INFO groupBy xx -> to: Tuple2 -> Sink: Unnamed (1/1) (GeneralRedisSinkFunction.invoke:169) - receive data(true,0,130,20200417)
> 2020-04-17 22:28:39,327    INFO groupBy xx -> to: Tuple2 -> Sink: Unnamed (1/1) (GeneralRedisSinkFunction.invoke:169) - receive data(false,0,130,20200417)
> 2020-04-17 22:28:39,327    INFO groupBy xx -> to: Tuple2 -> Sink: Unnamed (1/1) (GeneralRedisSinkFunction.invoke:169) - receive data(true,0,86,20200417)
> 2020-04-17 22:28:39,327    INFO groupBy xx -> to: Tuple2 -> Sink: Unnamed (1/1) (GeneralRedisSinkFunction.invoke:169) - receive data(false,0,86,20200417)
> 2020-04-17 22:28:39,328    INFO groupBy xx -> to: Tuple2 -> Sink: Unnamed (1/1) (GeneralRedisSinkFunction.invoke:169) - receive data(true,0,131,20200417)
>
>
> We are using Flink 1.7.2, and the test job runs with a parallelism of 1.
> This is the corresponding issue: https://issues.apache.org/jira/browse/FLINK-17228
>
>
> ------------------------------
> dixingxing85@163.com
>
>

-- 

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: libenchao@gmail.com; libenchao@pku.edu.cn