flink-user mailing list archives

From Benchao Li <libenc...@gmail.com>
Subject Re: Sorting Bounded Streams
Date Sat, 30 May 2020 04:00:49 GMT
Hi Satyam,

Are you using the Blink planner in streaming mode? AFAIK, the Blink planner
in batch mode can sort on arbitrary columns.
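
To illustrate why boundedness matters here, below is a minimal sketch in plain Java. It is not Flink code. The class `BoundedSortSketch`, the `Alert` row type, and the `sortBounded` helper are all hypothetical names made up for this example. The point is only that a global sort on an arbitrary column cannot emit its first row until the whole input has been read, which is possible only when the input is bounded. As far as I know, in Flink 1.10 batch mode is selected with `EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build()`, after which an ORDER BY on any column should be accepted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

// Plain-Java illustration only; none of these names come from Flink.
final class BoundedSortSketch {

    /** Hypothetical row type: an alert with an event timestamp and a severity. */
    static final class Alert {
        final long eventTime;
        final int severity;

        Alert(long eventTime, int severity) {
            this.eventTime = eventTime;
            this.severity = severity;
        }
    }

    /**
     * Globally sorts a bounded input on an arbitrary comparator.
     * Every row is buffered first, so nothing is emitted until the input ends.
     * On an unbounded input this loop would never terminate.
     */
    static List<Alert> sortBounded(Iterator<Alert> input, Comparator<Alert> order) {
        List<Alert> buffer = new ArrayList<>();
        while (input.hasNext()) {
            buffer.add(input.next());
        }
        buffer.sort(order);
        return buffer;
    }

    public static void main(String[] args) {
        List<Alert> rows = Arrays.asList(
                new Alert(1L, 3), new Alert(2L, 9), new Alert(3L, 1));

        // Sort on a non-time column (severity), ascending.
        List<Alert> sorted = sortBounded(
                rows.iterator(),
                Comparator.comparingInt((Alert a) -> a.severity));

        for (Alert a : sorted) {
            System.out.println("severity=" + a.severity + " eventTime=" + a.eventTime);
        }
    }
}
```

Sorting on a time attribute is different in streaming mode because watermarks tell the operator when no earlier rows can still arrive, so it can emit sorted rows incrementally instead of waiting for the end of the input.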

On Sat, May 30, 2020 at 6:19 AM, Satyam Shekhar <satyamshekhar@gmail.com> wrote:

> Hello,
> I am using Flink as the streaming execution engine for building a
> low-latency alerting application. The use case also requires ad-hoc
> querying on batch data, which I also plan to serve using Flink to avoid the
> complexity of maintaining two separate engines.
> My current understanding is that the ORDER BY operator in the Blink planner
> (on DataStream) requires a time attribute as the primary sort column. This
> is quite limiting for ad-hoc querying. It seems I can use the DataSet API to
> obtain a globally sorted output on an arbitrary column, but that will force
> me to use the older Flink planner.
> Specifically, I am looking for guidance from the community on the
> following questions -
>    1. Is it possible to obtain a globally sorted output on DataStreams on
>    an arbitrary sort column?
>    2. What are the tradeoffs of using DataSet vs DataStream in terms of
>    performance, long-term support, etc.?
>    3. Is there any other way to address this issue?
> Regards,
> Satyam


Benchao Li
