flink-dev mailing list archives

From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Question About "Preserve Partitioning" in Stream Iteration
Date Fri, 31 Jul 2015 12:22:48 GMT
I thought about having some tighter restrictions here. My idea was to
enforce that the feedback edges must have the same parallelism as the
original input stream, otherwise shipping strategies such as "keyBy",
"shuffle", "rebalance" don't seem to make sense because they would differ
from the distribution of the original elements (at least IMHO). Maybe I'm
wrong there, though.

To me it seems intuitive that I get the feedback at the head the way I
specify it at the tail. But maybe that's also just me... :D
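
To make the parallelism concern above concrete, here is a minimal sketch in plain Java. It does not use the Flink API: `keyChannel` is a hypothetical stand-in that picks a channel as hash modulo parallelism, which is a simplification and not how Flink's partitioners are actually implemented. It only shows that the same key can land on a different subtask index once the feedback parallelism differs from the input parallelism.

```java
import java.util.ArrayList;
import java.util.List;

public class FeedbackParallelismSketch {

    // Hypothetical stand-in for a key-based partitioner (assumption:
    // channel = hash modulo parallelism; not Flink's real implementation).
    static int keyChannel(Object key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Returns the keys whose target subtask index on the feedback edge
    // differs from the index they had in the original input stream.
    static List<Integer> mismatchedKeys(int[] keys, int inputParallelism, int feedbackParallelism) {
        List<Integer> mismatched = new ArrayList<>();
        for (int key : keys) {
            int inputChannel = keyChannel(key, inputParallelism);
            int feedbackChannel = keyChannel(key, feedbackParallelism);
            if (inputChannel != feedbackChannel) {
                mismatched.add(key);
            }
        }
        return mismatched;
    }

    public static void main(String[] args) {
        int[] keys = {0, 1, 2, 3, 4, 5};
        // Same parallelism on input and feedback: every key keeps its index.
        System.out.println(mismatchedKeys(keys, 4, 4));
        // Feedback parallelism 2 vs. input parallelism 4: some keys move.
        System.out.println(mismatchedKeys(keys, 4, 2));
    }
}
```

With equal parallelism no key changes its subtask index; with parallelism 4 on the input and 2 on the feedback edge, keys 2 and 3 end up on different indices in this simplified model.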

On Fri, 31 Jul 2015 at 14:00 Gyula Fóra <gyfora@apache.org> wrote:

> Hey,
>
> I am not sure what is the intuitive behaviour here. As you are not applying
> a transformation on the feedback stream but pass it to a closeWith method,
> I thought it was somehow natural that it gets the partitioning of the
> iteration input, but maybe it's not intuitive.
>
> If others also think that preserving feedback partitioning should be the
> default I am not against it :)
>
> Btw, this still won't make it very simple. We still need as many
> source/sink pairs as we have different parallelism among the head
> operators. Otherwise the forwarding logic won't work.
>
> Cheers,
> Gyula
>
> Aljoscha Krettek <aljoscha@apache.org> wrote (on Fri, 31 Jul 2015,
> 11:52):
>
> > Hi,
> > I'm currently working on making the StreamGraph generation more
> > centralized (i.e. not spread across the different API classes). The
> > question now is why we need to switch to "preserve partitioning". Could
> > we not make "preserve partitioning" the default, so that users who want
> > shuffle partitioning or anything else have to specify it manually when
> > adding the feedback edge?
> >
> > This would make for a very simple scheme where the iteration sources are
> > always connected to the heads using "forward" and the tails are connected
> > to the iteration sinks using whatever partitioner was set by the user.
> > This would make it more transparent than the current default of
> > "shuffle" between tails and iteration sinks.
> >
> > Cheers,
> > Aljoscha
> >
> > P.S. I know we had quite some discussion about introducing "preserve
> > partitioning" but now that I think of it, it should be the default... :D
> >
>
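
The scheme proposed in the quoted message (tails to iteration sinks via the user's partitioner, iteration sources to heads via "forward") can be sketched in plain Java as follows. This is only an illustration under stated assumptions: it does not use the Flink API, the class and method names are hypothetical, all heads share one parallelism, and each iteration sink hands its elements to the iteration source with the same index.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntBinaryOperator;

public class FeedbackWiringSketch {

    // tailOutputs: the elements emitted by each tail subtask.
    // partitioner: (value, parallelism) -> target channel, chosen by the user.
    // Returns the elements that each head subtask receives as feedback.
    static List<List<Integer>> route(List<List<Integer>> tailOutputs,
                                     int headParallelism,
                                     IntBinaryOperator partitioner) {
        // Step 1: tails -> iteration sinks, using the user's partitioner.
        List<List<Integer>> sinks = new ArrayList<>();
        for (int i = 0; i < headParallelism; i++) {
            sinks.add(new ArrayList<>());
        }
        for (List<Integer> tail : tailOutputs) {
            for (int value : tail) {
                int channel = partitioner.applyAsInt(value, headParallelism);
                sinks.get(channel).add(value);
            }
        }
        // Step 2: sink i -> source i -> head i ("forward"), so the
        // distribution chosen at the tail arrives unchanged at the head.
        List<List<Integer>> heads = new ArrayList<>();
        for (List<Integer> sink : sinks) {
            heads.add(new ArrayList<>(sink));
        }
        return heads;
    }

    public static void main(String[] args) {
        List<List<Integer>> tails = List.of(List.of(1, 2, 3), List.of(4, 5, 6));
        // Hypothetical user partitioner: value modulo parallelism.
        IntBinaryOperator byValue = (value, parallelism) -> Math.floorMod(value, parallelism);
        System.out.println(route(tails, 2, byValue));
    }
}
```

Because the second step is a pure forward, whatever partitioning is set at the tail is exactly what the heads observe. Heads with different parallelism would each need their own sink/source pair, as noted in the reply above.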
