flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4410) Report more information about operator checkpoints
Date Fri, 23 Dec 2016 21:05:58 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15773681#comment-15773681 ]

ASF GitHub Bot commented on FLINK-4410:

Github user StephanEwen commented on the issue:

    I think what would put a cherry on top is if we can break the "End To End Duration" down
      - Delay till triggering (how long until all barriers were there)
      - Synchronous checkpoint time
      - Asynchronous checkpoint time
    That would help big time, as many users currently get confused when checkpoints have long
    async times and assume that the computation halts for that time.
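
The breakdown proposed above can be sketched as plain arithmetic over timestamps. This is only an illustration under stated assumptions: the class `CheckpointTimings`, its fields, and its methods are hypothetical names and not part of Flink, and the sketch assumes five non-decreasing millisecond timestamps are available for one operator subtask (coordinator trigger, all barriers received, synchronous part done, asynchronous part done, acknowledgement received).

```java
// Hypothetical sketch: these names are illustrative and are not Flink APIs.
final class CheckpointTimings {
    private final long triggerTs;         // coordinator started the checkpoint
    private final long barriersAlignedTs; // all barriers arrived at the subtask
    private final long syncDoneTs;        // synchronous part finished
    private final long asyncDoneTs;       // asynchronous part finished
    private final long ackTs;             // coordinator received the acknowledgement

    CheckpointTimings(long triggerTs, long barriersAlignedTs, long syncDoneTs,
                      long asyncDoneTs, long ackTs) {
        // Reject inputs that cannot describe a single checkpoint in order.
        if (triggerTs > barriersAlignedTs || barriersAlignedTs > syncDoneTs
                || syncDoneTs > asyncDoneTs || asyncDoneTs > ackTs) {
            throw new IllegalArgumentException("timestamps must be non-decreasing");
        }
        this.triggerTs = triggerTs;
        this.barriersAlignedTs = barriersAlignedTs;
        this.syncDoneTs = syncDoneTs;
        this.asyncDoneTs = asyncDoneTs;
        this.ackTs = ackTs;
    }

    // "Delay till triggering": how long until all barriers were there.
    long delayUntilTriggerMillis() { return barriersAlignedTs - triggerTs; }

    // Duration of the synchronous part of the checkpoint.
    long syncMillis() { return syncDoneTs - barriersAlignedTs; }

    // Duration of the asynchronous part of the checkpoint.
    long asyncMillis() { return asyncDoneTs - syncDoneTs; }

    // The single number reported today: coordinator start to acknowledgement.
    long endToEndMillis() { return ackTs - triggerTs; }

    public static void main(String[] args) {
        // Made-up timestamps for illustration only.
        CheckpointTimings t = new CheckpointTimings(0L, 120L, 150L, 2150L, 2160L);
        System.out.println("delay until trigger: " + t.delayUntilTriggerMillis() + " ms");
        System.out.println("sync:                " + t.syncMillis() + " ms");
        System.out.println("async:               " + t.asyncMillis() + " ms");
        System.out.println("end to end:          " + t.endToEndMillis() + " ms");
    }
}
```

With the made-up numbers in `main`, almost all of the end-to-end duration falls into the asynchronous part, which is exactly the case the comment says users misread when only the total is shown.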

> Report more information about operator checkpoints
> --------------------------------------------------
>                 Key: FLINK-4410
>                 URL: https://issues.apache.org/jira/browse/FLINK-4410
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing, Webfrontend
>    Affects Versions: 1.1.2
>            Reporter: Ufuk Celebi
>            Assignee: Ufuk Celebi
>             Fix For: 1.2.0
> Checkpoint statistics contain the duration of a checkpoint, measured from the CheckpointCoordinator's
> start to the point when the acknowledge message was received.
> We should additionally expose
>   - duration of the synchronous part of a checkpoint
>   - duration of the asynchronous part of a checkpoint
>   - number of bytes buffered during the stream alignment phase
>   - duration of the stream alignment phase
> Note: When using *at-least-once* semantics, the latter two will always be zero.
