flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Piotr Nowojski <pi...@data-artisans.com>
Subject Re: Rebalance to subtasks in same TaskManager instance
Date Tue, 06 Feb 2018 08:27:51 GMT

Unfortunately I don’t think it’s currently possible in the Flink. Please feel free to
submit a feature request for it on our JIRA https://issues.apache.org/jira/projects/FLINK/summary

Have you tried out the setup using rebalance? In most cases overhead of rebalance over rescale
is not that high as one might think.


> On 5 Feb 2018, at 15:16, johannes.barnard@clarivate.com wrote:
> Hi,
> I have a streaming topology with source parallelism of M and a target
> operator parallelism of N.
> For optimum performance I have found that I need to choose M and N
> independently.
> Also, the source subtasks do not all produce the same number of records and
> therefor I have to rebalance to the target operator to get optimum
> throughput.
> The record sizes vary a lot (up to 10MB) but are about 200kB on average.
> Through experimentation using the rescale() operator I have found that
> maximum throughput can be significantly increased if I restrict this
> rebalancing to target subtasks within the same TaskManager instances.
> However I cannot use rescale for this purpose as it does not do a
> rebalancing to all target subtasks in the instance.
> I was hoping to use a custom Partitioner to achieve this but it is not clear
> to me which partition would correspond to which subTask.
> Is there any way currently to achieve this with Flink? 
> If it helps I believe the feature I am hoping to achieve is similar to
> Storm's "Local or shuffle grouping".
> Any help or suggestions will be appreciated.
> Hans
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

View raw message