flink-issues mailing list archives

From "Chesnay Schepler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-6742) Savepoint conversion might fail if operators change
Date Sat, 27 May 2017 16:35:04 GMT

    [ https://issues.apache.org/jira/browse/FLINK-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16027493#comment-16027493 ]

Chesnay Schepler commented on FLINK-6742:

Well, alright, technically you can add operators, so long as they don't modify chains.

The internal structure of savepoints changed in 1.3. A 1.2 savepoint contains the state
of tasks, each stored as a list of the states of the operators the task contains. A 1.3
savepoint contains only the states of operators; the notion of tasks was removed. To map an
old savepoint to a new one, we have to map the state of each task to its individual operators.
For non-chained operators this is easy, but for chains it can only be done reliably if the
chains don't change, i.e. no operator is removed or added.
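
To illustrate the mapping described above, here is a minimal sketch, not the actual Flink
conversion code. It assumes a task's state is a plain list with one entry per chained operator
and that states are simple strings; the class name, method name and operator IDs are all
hypothetical. Since the only link between old and new state is the position in the chain,
the sketch refuses to map when the chain length differs.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Flink.
public class SavepointMappingSketch {

    /**
     * Splits the state of one task (one list entry per chained operator)
     * into per-operator state, keyed by the operator IDs of the new job.
     */
    static Map<String, String> mapTaskStateToOperators(
            List<String> oldChainState, List<String> newChainOperatorIds) {
        // The mapping is purely positional, so a changed chain length means
        // we cannot tell which state belongs to which operator. An equal
        // length is a necessary condition, not a proof that the chain is unchanged.
        if (oldChainState.size() != newChainOperatorIds.size()) {
            throw new IllegalStateException(
                "Chain length changed from " + oldChainState.size()
                    + " to " + newChainOperatorIds.size()
                    + "; cannot map task state to operators.");
        }
        Map<String, String> operatorStates = new LinkedHashMap<>();
        for (int i = 0; i < oldChainState.size(); i++) {
            operatorStates.put(newChainOperatorIds.get(i), oldChainState.get(i));
        }
        return operatorStates;
    }

    public static void main(String[] args) {
        // Unchanged chain of two operators: each state lands on its operator.
        System.out.println(mapTaskStateToOperators(
            List.of("sourceState", "mapState"), List.of("source", "map")));
    }
}
```

With an operator added to or removed from the chain, the same call throws instead of
silently assigning state to the wrong operator.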

The problem is that we don't know what happened to the missing task. It may very well be that
the task was removed on purpose and its state should be discarded. But it could also be that a
user read that you can modify chains in 1.3 and did so before migrating the savepoint. That,
however, only works after migration.

This isn't a technical hurdle, but a safety precaution.
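
As a rough sketch of what such a precaution could look like (an assumption for illustration,
not the actual Flink code), the conversion could check up front that every task in the old
savepoint still exists in the job and fail with a descriptive message rather than a
NullPointerException. The class name, method name and task IDs are hypothetical.

```java
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Flink.
public class MissingTaskCheckSketch {

    /**
     * Fails fast if the old savepoint holds state for a task that the
     * job no longer contains, instead of silently dropping that state.
     */
    static void checkAllTasksPresent(Set<String> savepointTaskIds, Set<String> jobTaskIds) {
        for (String taskId : savepointTaskIds) {
            if (!jobTaskIds.contains(taskId)) {
                // We cannot know whether the task was removed on purpose,
                // so we refuse to convert rather than guess.
                throw new IllegalStateException(
                    "Savepoint contains state for task '" + taskId
                        + "', which is not part of the job.");
            }
        }
    }

    public static void main(String[] args) {
        // All tasks of the savepoint are still present: the check passes.
        checkAllTasksPresent(Set.of("taskA", "taskB"), Set.of("taskA", "taskB", "taskC"));
        System.out.println("all savepoint tasks found");
    }
}
```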

> Savepoint conversion might fail if operators change
> ---------------------------------------------------
>                 Key: FLINK-6742
>                 URL: https://issues.apache.org/jira/browse/FLINK-6742
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.3.0
>            Reporter: Gyula Fora
>            Priority: Critical
> Caused by: java.lang.NullPointerException
> 	at org.apache.flink.runtime.checkpoint.savepoint.SavepointV2.convertToOperatorStateSavepointV2(SavepointV2.java:171)
> 	at org.apache.flink.runtime.checkpoint.savepoint.SavepointLoader.loadAndValidateSavepoint(SavepointLoader.java:75)
> 	at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1090)

This message was sent by Atlassian JIRA