beam-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (BEAM-2467) KinesisIO watermark based on approximateArrivalTimestamp
Date Mon, 25 Sep 2017 22:36:00 GMT


ASF GitHub Bot commented on BEAM-2467:

Github user asfgit closed the pull request at:

> KinesisIO watermark based on approximateArrivalTimestamp
> --------------------------------------------------------
>                 Key: BEAM-2467
>                 URL:
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-extensions
>            Reporter: Paweł Kaczmarczyk
>            Assignee: Paweł Kaczmarczyk
> In Kinesis we can start reading the stream at some point in the past within the retention
period (up to 7 days). With the current approach to setting the record's timestamp and the
watermark (both are always set to the current time), we can't observe the actual position
in the stream.
> So the idea is to change this behaviour and set the record timestamp based on the ApproximateArrivalTimestamp.
The watermark will be set according to the last read record's timestamp.
> ApproximateArrivalTimestamp is still only an approximation and may produce records
with out-of-order timestamps, which in turn may cause some events to be marked as late. This
should not be a frequent issue, however, and when it happens the skew should be a matter of
milliseconds or seconds, so it can be handled even with a tiny allowedLateness setting.
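
The idea described above can be illustrated with a small standalone sketch. It is not the KinesisIO implementation and does not use the Beam API: the class name ArrivalWatermarkSketch and its methods are hypothetical, the watermark is kept monotonic by taking the maximum arrival timestamp seen so far (an assumption, since the description only says it follows the last read record), and lateness is checked per record rather than per window as Beam actually does.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration only; names are not taken from Beam or KinesisIO.
final class ArrivalWatermarkSketch {
    private final Duration allowedLateness;
    // Assumption: start at the epoch until the first record has been read.
    private Instant watermark = Instant.EPOCH;

    ArrivalWatermarkSketch(Duration allowedLateness) {
        this.allowedLateness = allowedLateness;
    }

    /**
     * Uses the record's approximate arrival timestamp as its event timestamp,
     * advances the watermark if the record is newer, and reports whether the
     * record fell behind the watermark by more than the allowed lateness.
     */
    boolean observe(Instant approximateArrivalTimestamp) {
        boolean tooLate =
            approximateArrivalTimestamp.isBefore(watermark.minus(allowedLateness));
        // Assumption: never move the watermark backwards on out-of-order records.
        if (approximateArrivalTimestamp.isAfter(watermark)) {
            watermark = approximateArrivalTimestamp;
        }
        return tooLate;
    }

    Instant getWatermark() {
        return watermark;
    }
}
```

With an allowed lateness of two seconds, a record whose arrival timestamp is 500 ms behind the watermark is still accepted in this sketch, while one that is ten seconds behind is reported as too late.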

This message was sent by Atlassian JIRA
