flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Nguyen, Michael" <Michael.Nguye...@T-Mobile.com>
Subject Re: How does Flink handle backpressure in EMR
Date Thu, 05 Dec 2019 18:11:29 GMT
Hi Roman,

So right now we have a couple Flink jobs that consumes data from one Kinesis data stream.
These jobs vary from a simple dump into a PostgreSQL table to calculating anomalies in a 30
minute window.

One large scenario we were worried about was what if one of our jobs was taking a long time
to process the Kinesis stream data? How would we detect this scenario from within our Flink
job?

We do not want our Flink jobs to lag too far from the latest point in our Kinesis stream as
we are trying to deliver information in (near) real-time.

From: Khachatryan Roman <khachatryan.roman@gmail.com>
Date: Thursday, December 5, 2019 at 9:47 AM
To: Michael Nguyen <Michael.Nguyen79@T-Mobile.com>
Cc: Piotr Nowojski <piotr@ververica.com>, "user@flink.apache.org" <user@flink.apache.org>
Subject: Re: How does Flink handle backpressure in EMR

[External]

@Michael,
Could you please describe your topology with which operators being slow, back-pressured and
probably skews in sources?

Regards,
Roman


On Thu, Dec 5, 2019 at 6:20 PM Nguyen, Michael <Michael.Nguyen79@t-mobile.com<mailto:Michael.Nguyen79@t-mobile.com>>
wrote:
Thank you for the response Roman and Piotrek!

@Roman - can you clarify on what you mean when you mentioned Flink propagating it back to
the sources?

Also, if one of my Flink operators is processing records too slowly and is getting further
away from the latest record of my source data stream, is there a way to detect this slow processing
in Flink? Would this be detected by Flink's backpressure mechanism?

Thanks,
Michael

On 12/5/19, 7:57 AM, "Piotr Nowojski" <piotr@data-artisans.com<mailto:piotr@data-artisans.com>
on behalf of piotr@ververica.com<mailto:piotr@ververica.com>> wrote:

    [External]


    Hi Michael,

    As Roman pointed out Flink currently doesn’t support the auto-scaling. It’s on our
roadmap but it requires quite a bit of preliminary work to happen before.

    Piotrek

    > On 5 Dec 2019, at 15:32, r_khachatryan <khachatryan.roman@gmail.com<mailto:khachatryan.roman@gmail.com>>
wrote:
    >
    > Hi Michael
    >
    > Flink *does* detect backpressure but currently, it only propagates it back
    > to sources.
    > And so it doesn't support auto-scaling.
    >
    > Regards,
    > Roman
    >
    >
    > Nguyen, Michael wrote
    >> How does Flink handle backpressure (caused by an increase in traffic) in a
    >> Flink job when it’s being hosted in an EMR cluster? Does Flink detect the
    >> backpressure and auto-scales the EMR cluster to handle the workload to
    >> relieve the backpressure? Once the backpressure is gone, then the EMR
    >> cluster would scale back down?
    >
    >
    >
    >
    >
    > --
    > Sent from: https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fapache-flink-user-mailing-list-archive.2336050.n4.nabble.com%2F&amp;data=02%7C01%7CMichael.Nguyen79%40t-mobile.com%7Cfda964dacbf04cc2d2cb08d7799bc347%7Cbe0f980bdd994b19bd7bbc71a09b026c%7C0%7C0%7C637111582216065152&amp;sdata=FIHbAWbkH7Tq8AjG%2BVSZoxNrOl0Bn6SukN%2BY1PyhSHg%3D&amp;reserved=0<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/>


Mime
View raw message