flink-issues mailing list archives

From "Bowen Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-7622) Respect local KPL queue size in FlinkKinesisProducer when adding records to KPL client
Date Thu, 14 Sep 2017 20:24:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16166925#comment-16166925 ]

Bowen Li commented on FLINK-7622:
---------------------------------

[~tzulitai] I think FLINK-7508 indirectly solves this issue, since FLINK-7508 greatly increased
the throughput of the Kinesis producer. Here's an example:

At https://imgur.com/a/u198h are the metrics of our prod Flink job: "UserRecords Put" (i.e.
UserRecords put into the KPL queue) and "user_records_pending" (i.e. outstanding UserRecords
in the queue waiting to be sent to AWS).

When using the per_request threading model, a relatively small number of UserRecords put
(~0.5 million) caused a huge number (~15k) of pending records, because throughput was so low.
After switching to the pooled threading model, you can see the number of outstanding UserRecords
has dropped to almost zero, even though the number of UserRecords put into the queue grew 16x
(~8 million at peak). Take a closer look at our user_records_pending metric for the past two
weeks at https://imgur.com/a/2YxIm: the number of outstanding records is consistently under
150 (impressive, right?).

Thus, I believe propagating backpressure upstream in FlinkKinesisProducer is no longer
necessary.
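
For reference, here's a minimal sketch of how the pooled threading model can be enabled on
the producer after FLINK-7508, assuming the KPL property pass-through where unrecognized keys
(e.g. ThreadingModel, ThreadPoolSize) are forwarded to the KPL's KinesisProducerConfiguration;
the region, stream name, and pool size below are placeholders:

{code:java}
import java.util.Properties;

import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class PooledKplProducer {

    public static FlinkKinesisProducer<String> create() {
        Properties config = new Properties();
        config.setProperty("aws.region", "us-west-2"); // placeholder region

        // With FLINK-7508, unrecognized keys are forwarded to the KPL's
        // KinesisProducerConfiguration, so native KPL settings can be tuned here.
        config.setProperty("ThreadingModel", "POOLED");
        config.setProperty("ThreadPoolSize", "10"); // placeholder pool size

        FlinkKinesisProducer<String> producer =
                new FlinkKinesisProducer<>(new SimpleStringSchema(), config);
        producer.setDefaultStream("my-stream"); // placeholder stream name
        producer.setDefaultPartition("0");
        return producer;
    }
}
{code}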

> Respect local KPL queue size in FlinkKinesisProducer when adding records to KPL client
> --------------------------------------------------------------------------------------
>
>                 Key: FLINK-7622
>                 URL: https://issues.apache.org/jira/browse/FLINK-7622
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kinesis Connector
>            Reporter: Tzu-Li (Gordon) Tai
>
> This issue was brought up by [~sthm] in an offline discussion.
> Currently, records are added to the Kinesis KPL producer client without checking the number
> of outstanding records in the local KPL queue. This effectively ignores backpressure when
> producing to Kinesis through the KPL, and can therefore exhaust system resources.
> We should respect {{producer.getOutstandingRecordsCount()}} as a measure of backpressure,
> and propagate backpressure upstream by blocking further sink invocations when some threshold
> of outstanding record count is exceeded. The recommended threshold [1] seems to be 10,000.
> [1] https://aws.amazon.com/blogs/big-data/implementing-efficient-and-reliable-producers-with-the-amazon-kinesis-producer-library/
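
For context, a minimal sketch of the blocking approach described above, using the KPL's
KinesisProducer.getOutstandingRecordsCount(); the threshold constant and polling interval
are illustrative, not the connector's actual implementation:

{code:java}
import com.amazonaws.services.kinesis.producer.KinesisProducer;

public class OutstandingRecordsGate {

    // Recommended threshold from [1]; illustrative constant, not a connector setting.
    private static final long MAX_OUTSTANDING_RECORDS = 10_000;

    /**
     * Blocks until the KPL's local queue drains below the threshold, thereby
     * propagating backpressure to the caller (e.g. the sink's invoke()).
     */
    static void waitForCapacity(KinesisProducer producer) throws InterruptedException {
        while (producer.getOutstandingRecordsCount() > MAX_OUTSTANDING_RECORDS) {
            Thread.sleep(50); // simple polling; a flush/notify scheme would also work
        }
    }
}
{code}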



