flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4574) Strengthen fetch interval implementation in Kinesis consumer
Date Thu, 02 Feb 2017 08:08:52 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15849627#comment-15849627 ]

ASF GitHub Bot commented on FLINK-4574:

Github user tzulitai commented on a diff in the pull request:

    --- Diff: flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/internals/ShardConsumer.java
    @@ -88,6 +96,7 @@ protected ShardConsumer(KinesisDataFetcher<T> fetcherRef,
     							Integer subscribedShardStateIndex,
     							KinesisStreamShard subscribedShard,
     							SequenceNumber lastSequenceNum,
    +							AtomicReference<Throwable> error,
    --- End diff --
    I don't think you need to add this constructor argument here, because it isn't used in
the tests, correct?
    This protected constructor exists for testing purposes. For example, in the tests, we
mock a `KinesisProxyInterface` and inject it into a `ShardConsumer` under test through this
constructor.
    On the other hand, it would be good to add tests for error handling across the new
threads, in which case this constructor change can be left as is.
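
    For illustration, a minimal sketch of the kind of cross-thread error handling such a test
could exercise, assuming the `AtomicReference<Throwable>` is meant to hold the first failure
raised on a worker thread. `SharedErrorExample` and `runAndCapture` are hypothetical names and
not part of the pull request:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical helper; not taken from the connector code. */
final class SharedErrorExample {

    /**
     * Runs the task on a separate thread and records the first failure
     * in the shared reference so the calling thread can inspect it later.
     */
    static void runAndCapture(Runnable task, AtomicReference<Throwable> error)
            throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.execute(() -> {
            try {
                task.run();
            } catch (Throwable t) {
                // Only the first error wins; later ones leave the reference untouched.
                error.compareAndSet(null, t);
            }
        });
        executor.shutdown();
        // Wait (bounded) for the worker to finish before the caller reads the reference.
        executor.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

    A test could then assert that the reference is non-null after a task that throws, and
still null after a task that completes normally.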

> Strengthen fetch interval implementation in Kinesis consumer
> ------------------------------------------------------------
>                 Key: FLINK-4574
>                 URL: https://issues.apache.org/jira/browse/FLINK-4574
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kinesis Connector
>    Affects Versions: 1.1.0
>            Reporter: Tzu-Li (Gordon) Tai
>            Assignee: Wei-Che Wei
> As pointed out by [~rmetzger], the current fetch interval implementation in the {{ShardConsumer}}
class of the Kinesis consumer can lead to much longer intervals than specified by the
user. For example, if the specified fetch interval is {{f}}, a {{getRecords()}} call takes {{x}}
to complete, and processing the fetched records for emitting takes {{y}}, then the actual interval
between fetches is {{f+x+y}}.
> The main problem with this is that we can never guarantee how much time has passed since
the last {{getRecords()}} call, and thus cannot guarantee that returned shard iterators will not
have expired by the next time we use them, even if we limit the user-given value for {{f}} to
be no longer than the iterator expiration time.
> I propose to improve this by using, per {{ShardConsumer}}, a {{ScheduledExecutorService}}
/ {{Timer}} to do the fixed-interval fetching, and a separate blocking queue that collects
the fetched records for emitting.
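
The proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual `ShardConsumer` change: `FixedIntervalFetcher`, `start`, `nextBatch`, and `stop` are hypothetical names, a `Supplier<List<String>>` stands in for the `getRecords()` call, and plain strings stand in for Kinesis records.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch of fixed-interval fetching decoupled from emitting. */
final class FixedIntervalFetcher {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final BlockingQueue<List<String>> fetched;

    FixedIntervalFetcher(int queueCapacity) {
        // Bounded queue: a slow emitter eventually blocks the fetcher.
        this.fetched = new LinkedBlockingQueue<>(queueCapacity);
    }

    /** Schedules a fetch every intervalMillis, regardless of how long emitting takes. */
    void start(Supplier<List<String>> getRecords, long intervalMillis) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                // The supplier stands in for the real getRecords() call.
                fetched.put(getRecords.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            // Note: an unchecked exception thrown by the supplier would
            // suppress all further executions of this scheduled task.
        }, 0, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Called by the emitting thread; returns null if nothing arrives in time. */
    List<String> nextBatch(long timeoutMillis) throws InterruptedException {
        return fetched.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

With this shape, the processing time {{y}} no longer adds to the fetch interval as long as the queue has room. A fetch that takes longer than {{f}} still delays the next one, since `scheduleAtFixedRate` does not run executions of the same task concurrently, and a full queue blocks the fetching thread, so the iterator expiry concern is reduced rather than eliminated.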

This message was sent by Atlassian JIRA
