flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-3336) Add Semi-Rebalance Data Shipping for DataStream
Date Fri, 05 Feb 2016 12:59:39 GMT

    [ https://issues.apache.org/jira/browse/FLINK-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15134130#comment-15134130

ASF GitHub Bot commented on FLINK-3336:

Github user tillrohrmann commented on the pull request:

    There are still some rebalance vs. rescale mix-ups in the documentation. I haven't looked
over all docs but I spotted two.

> Add Semi-Rebalance Data Shipping for DataStream
> -----------------------------------------------
>                 Key: FLINK-3336
>                 URL: https://issues.apache.org/jira/browse/FLINK-3336
>             Project: Flink
>          Issue Type: Improvement
>          Components: Streaming
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>             Fix For: 1.0.0
> This feature has recently been requested on the ML: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Distribution-of-sinks-among-the-nodes-td4640.html
> The new data shipping pattern would allow to rebalance data only to a subset of downstream
> The subset of downstream operations to which the upstream operation would send
> elements depends on the degree of parallelism of both the upstream and downstream operation.
> For example, if the upstream operation has parallelism 2 and the downstream operation
> has parallelism 4, then one upstream operation would distribute elements to two
> downstream operations while the other upstream operation would distribute to the other
> two downstream operations. If, on the other hand, the downstream operation had parallelism
> 2 while the upstream operation has parallelism 4 then two upstream operations would
> distribute to one downstream operation while the other two upstream operations would
> distribute to the other downstream operations.
> In cases where the different parallelisms are not multiples of each other one or several
> downstream operations would have a differing number of inputs from upstream operations.

This message was sent by Atlassian JIRA

View raw message