flink-dev mailing list archives

From Lukasz Cwik <lc...@google.com.INVALID>
Subject Re: Towards a spec for robust streaming SQL, Part 1
Date Fri, 21 Apr 2017 19:36:35 GMT
The doc is a good read.

I think you do a great job of explaining table -> stream, stream -> stream,
and stream -> table when there is only one stream.
But when there are multiple streams reading/writing to a table, how does
that impact what occurs?
For example, with CoGBK you have multiple streams writing to a table, how
does that impact window merging?

On Thu, Apr 20, 2017 at 5:57 PM, Tyler Akidau <takidau@google.com.invalid>
wrote:

> Hello Beam, Calcite, and Flink dev lists!
>
> Apologies for the big cross post, but I thought this might be something all
> three communities would find relevant.
>
> Beam is finally making progress on a SQL DSL utilizing Calcite, thanks to
> Mingmin Xu. As you can imagine, we need to come to some conclusion about
> how to elegantly support the full suite of streaming functionality in the
> Beam model via Calcite SQL. You folks in the Flink community have been
> pushing on this (e.g., adding windowing constructs, amongst others, thank
> you! :-), but from my understanding we still don't have a full spec for how
> to support robust streaming in SQL (including but not limited to, e.g., a
> triggers analogue such as EMIT).
>
> I've been spending a lot of time thinking about this and have some opinions
> about how I think it should look that I've already written down, so I
> volunteered to try to drive forward agreement on a general streaming SQL
> spec between our three communities (well, technically I volunteered to do
> that w/ Beam and Calcite, but I figured you Flink folks might want to join
> in since you're going that direction already anyway and will have useful
> insights :-).
>
> My plan was to do this by sharing two docs:
>
>    1. The Beam Model : Streams & Tables - This one is for context, and
>    really only mentions SQL in passing. But it describes the relationship
>    between the Beam Model and the "streams & tables" way of thinking, which
>    turns out to be useful in understanding what robust streaming in SQL
>    might look like. Many of you probably already know some or all of what's
>    in here, but I felt it was necessary to have it all written down in
>    order to justify some of the proposals I wanted to make in the second
>    doc.
>
>    2. A streaming SQL spec for Calcite - The goal for this doc is that it
>    would become a general specification for what robust streaming SQL in
>    Calcite should look like. It would start out as a basic proposal of what
>    things *could* look like (combining both what things look like now as
>    well as a set of proposed changes for the future), and we could all
>    iterate on it together until we get to something we're happy with.
>
> At this point, I have doc #1 ready, and it's a bit of a monster, so I
> figured I'd share it and let folks hack at it with comments if they have
> any, while I try to get the second doc ready in the meantime. As part of
> getting doc #2 ready, I'll be starting a separate thread to try to gather
> input on what things are already in flight for streaming SQL across the
> various communities, to make sure the proposal captures everything that's
> going on as accurately as it can.
>
> If you have any questions or comments, I'm interested to hear them.
> Otherwise, here's doc #1, "The Beam Model : Streams & Tables":
>
>   http://s.apache.org/beam-streams-tables
>
> -Tyler
>
