beam-commits mailing list archives

From "Xu Mingmin (JIRA)" <>
Subject [jira] [Commented] (BEAM-1775) fix issue of start_from_previous_offset in KafkaIO
Date Tue, 21 Mar 2017 17:18:41 GMT


Xu Mingmin commented on BEAM-1775:

As I mentioned on the mailing list, each unbounded IO should try its best to restore from the
last offset when the runner does not provide a `CheckpointMark`.

I will start working on KafkaIO. Besides earliest and latest, uncommitted_earliest and
uncommitted_latest will be supported. With the two new options, KafkaIO restores the offset of
the last run if it is available.
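
To make the proposal concrete, here is a minimal sketch of how a start offset could be chosen
under the four options. It is not KafkaIO code: `StartPolicy`, `StartOffsetResolver` and
`resolve` are hypothetical names, and two behaviours are assumptions of mine rather than
anything decided in this issue: a runner-provided checkpoint offset always wins, and the
uncommitted_* options fall back to earliest/latest when no offset from a previous run exists.

```java
import java.util.Optional;

/** Hypothetical start policies, named after the options discussed in this comment. */
enum StartPolicy { EARLIEST, LATEST, UNCOMMITTED_EARLIEST, UNCOMMITTED_LATEST }

/** Hypothetical helper; not part of KafkaIO. */
final class StartOffsetResolver {
  private StartOffsetResolver() {}

  /**
   * Picks the offset a reader should start from for one partition.
   *
   * @param policy           configured start policy
   * @param checkpointOffset offset from a runner-provided CheckpointMark, if any
   * @param lastRunOffset    offset recorded by the previous run, if any
   * @param earliest         earliest offset still available in the partition
   * @param latest           latest (end) offset of the partition
   */
  static long resolve(
      StartPolicy policy,
      Optional<Long> checkpointOffset,
      Optional<Long> lastRunOffset,
      long earliest,
      long latest) {
    // Assumption: when the runner restores a checkpoint, it takes precedence.
    if (checkpointOffset.isPresent()) {
      return checkpointOffset.get();
    }
    switch (policy) {
      case EARLIEST:
        return earliest;
      case LATEST:
        return latest;
      case UNCOMMITTED_EARLIEST:
        // Assumption: fall back to earliest when there is no previous offset.
        return lastRunOffset.orElse(earliest);
      case UNCOMMITTED_LATEST:
        // Assumption: fall back to latest when there is no previous offset.
        return lastRunOffset.orElse(latest);
      default:
        throw new IllegalArgumentException("Unknown policy: " + policy);
    }
  }
}
```

With this sketch, a restart without a checkpoint under UNCOMMITTED_LATEST resumes from the
previous run's offset when one was recorded, and only starts from the end of the partition on
a first run.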

> fix issue of start_from_previous_offset in KafkaIO
> --------------------------------------------------
>                 Key: BEAM-1775
>                 URL:
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-extensions
>            Reporter: Xu Mingmin
>            Assignee: Davor Bonaci
> Jins George via 
> 5:50 PM (15 hours ago)
> to user
> Hello,
> I am writing a streaming Beam pipeline with the Flink runner to consume data from Kafka,
> apply some transformations, and persist the results to HBase.
> If I restart the application (due to a failure or a manual restart), the consumer does not
> resume from the offset where it was prior to the restart. It always resumes from the latest offset.
> If I enable Flink checkpointing with the HDFS state back-end, the system appears to resume
> from the earliest offset.
> Is there a recommended way to resume from the offset where it stopped?
> Thanks,
> Jins George

This message was sent by Atlassian JIRA
