flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lakshmi Rao (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-9692) Adapt maxRecords parameter in the getRecords call to optimize bytes read from Kinesis
Date Fri, 29 Jun 2018 20:05:00 GMT
Lakshmi Rao created FLINK-9692:
----------------------------------

             Summary: Adapt maxRecords parameter in the getRecords call to optimize bytes
read from Kinesis 
                 Key: FLINK-9692
                 URL: https://issues.apache.org/jira/browse/FLINK-9692
             Project: Flink
          Issue Type: Improvement
          Components: Kinesis Connector
    Affects Versions: 1.4.2, 1.5.0
            Reporter: Lakshmi Rao


The Kinesis connector currently has a [constant value|https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumer.java#L213] set
for maxRecords that it can fetch from a single Kinesis getRecords call. However, in most realtime
scenarios, the average size of the Kinesis record (in bytes) changes depending on the situation
i.e. you could be in a transient scenario where you are reading large sized records and would
hence like to fetch fewer records in each getRecords call (so as to not exceed the 2 Mb/sec
[per shard limit|https://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html] on
the getRecords call). 

The idea here is to adapt the Kinesis connector to identify an average batch size prior to
making the getRecords call, so that the maxRecords parameter can be appropriately tuned before
making the call. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message