flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2314) Make Streaming File Sources Persistent
Date Tue, 25 Aug 2015 15:57:46 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711509#comment-14711509
] 

ASF GitHub Bot commented on FLINK-2314:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/997#discussion_r37883013
  
    --- Diff: flink-staging/flink-streaming/flink-streaming-core/src/main/java/org/apache/flink/streaming/api/functions/source/FileSourceFunction.java
---
    @@ -120,12 +131,24 @@ public void run(SourceContext<OUT> ctx) throws Exception {
     			OUT nextElement = serializer.createInstance();
     			nextElement =  format.nextRecord(nextElement);
     			if (nextElement == null && splitIterator.hasNext()) {
    -				format.open(splitIterator.next());
    +				InputSplit split = splitIterator.next();
    +				splitNumber = split.getSplitNumber();
    +				currRecord = 0l;
    +				format.open(split);
     				continue;
     			} else if (nextElement == null) {
     				break;
     			}
    -			ctx.collect(nextElement);
    +			if(splitNumber == checkpointedSplit){
    --- End diff --
    
    What if you've checkpointed the 2. split after seeing the 1. and 2. split and now the
source is re-executed with the first split? Aren't records written again because you only
save the latest checkpointed split number?


> Make Streaming File Sources Persistent
> --------------------------------------
>
>                 Key: FLINK-2314
>                 URL: https://issues.apache.org/jira/browse/FLINK-2314
>             Project: Flink
>          Issue Type: Improvement
>          Components: Streaming
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Sheetal Parade
>              Labels: easyfix, starter
>
> Streaming File sources should participate in the checkpointing. They should track the
bytes they read from the file and checkpoint it.
> One can look at the sequence generating source function for an example of a checkpointed
source.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message