flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5937) Add documentation about the task lifecycle.
Date Fri, 03 Mar 2017 11:30:45 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894182#comment-15894182
] 

ASF GitHub Bot commented on FLINK-5937:
---------------------------------------

Github user kl0u commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3429#discussion_r104136196
  
    --- Diff: docs/internals/task_lifecycle.md ---
    @@ -0,0 +1,186 @@
    +---
    +title:  "Task Lifecycle"
    +nav-title: Task Lifecycle
    +nav-parent_id: internals
    +nav-pos: 5
    +---
    +<!--
    +Licensed to the Apache Software Foundation (ASF) under one
    +or more contributor license agreements.  See the NOTICE file
    +distributed with this work for additional information
    +regarding copyright ownership.  The ASF licenses this file
    +to you under the Apache License, Version 2.0 (the
    +"License"); you may not use this file except in compliance
    +with the License.  You may obtain a copy of the License at
    +
    +  http://www.apache.org/licenses/LICENSE-2.0
    +
    +Unless required by applicable law or agreed to in writing,
    +software distributed under the License is distributed on an
    +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    +KIND, either express or implied.  See the License for the
    +specific language governing permissions and limitations
    +under the License.
    +-->
    +
    +A task in Flink is the basic unit of execution. It is the place where each parallel instance
of an operator is executed. 
    +
    +This document goes through the different phases in the lifecycle of the `StreamTask`,
which is the base for all different 
    +task sub-types in Flink's streaming engine. 
    +
    +* This will be replaced by the TOC
    +{:toc}
    +
    +## Operator Lifecycle in a nutshell
    +
    +Because the task is the entity that executes your operators, its lifecycle is tightly
integrated with that of an operator. 
    +So, we will briefly mention the basic methods representing the lifecycle of an operator
before diving into those of the 
    +`StreamTask` itself. The list is presented below in the order that each of the methods
is called. Given that an operator
    +can have a user-defined function (*UDF*), below each of the operator methods we also
present (indented) the methods in 
    +the lifecycle of UDF that it calls. These methods are available if your operator extends
the `AbstractUdfStreamOperator`, 
    +which is the basic class for all operators that execute UDFs.
    +
    +        // initialization
    +        OPERATOR::setup
    +            UDF::setRuntimeContext
    +        OPERATOR::initializeState
    +        OPERATOR::open
    +            UDF::open
    +        
    +        // processing
    +        OPERATOR::processElement
    +            UDF::run
    +        OPERATOR::processWatermark
    +        
    +        // checkpointing (called asynchronously)
    +        OPERATOR::snapshotState
    +                
    +        // termination
    +        OPERATOR::close
    +            UDF::close
    +        OPERATOR::dispose
    +    
    +In a nutshell, the `setup()` is called to initialize some operator-specific machinery,
such as its `RuntimeContext` and 
    +its metric collection data-structures. After this, the `initializeState()` gives an operator
its initial state, and the 
    + `open()` method executes any operator-specific initialization, such as opening the user-defined
function in the case of 
    +the `AbstractUdfStreamOperator`. 
    +
    +<span class="label label-danger">Attention</span> The `initializeState()`
contains both the logic for initializing the 
    +state of the operator during its initial execution (*e.g.* register any keyed state),
and also the logic to retrieve its
    +state from a checkpoint after a failure. More about this on the rest of this page.
    +
    +Now that everything is set, the operator is ready to process incoming data. This is done
by invoking the `processElement()` 
    +and `processWatermark()` methods which contain the logic for processing elements and
watermarks, respectively. The 
    +`processElement()` is also the place where you function's logic is invoked, *e.g.* the
`map()` method of your `MapFunction`.
    +
    +Finally, in the case of a normal, fault-free termination of the operator (*e.g.* if the
stream is finite and its end is 
    +reached), the `close()` method is called to perform any final bookkeeping action required
by the operator's logic, and 
    +the `dispose()` is called after that to free any resources held by the operator. 
    +
    +In the case of a termination due to a failure or due to manual cancellation, the execution
jumps directly to the `dispose()` 
    +and skips any intermediate phases between the phase the operator was in when the failure
happened and the `dispose()`.
    +
    +**Checkpoints:** The `snapshotState()` method of the operator is called asynchronously
to the rest of the methods described 
    +above whenever a checkpoint barrier is received. Checkpoints are performed during the
processing phase, *i.e.* after the 
    +operator is opened and before it is closed. The responsibility of this method is to store
the current state of the operator 
    +to the specified [state backend]({{ site.baseurl }}/ops/state_backends.html) from where
it will be retrieved when 
    +the job resumes execution after a failure. Below we include a brief description of Flink's
checkpointing mechanism, 
    +and for a more detailed discussion on the principles around checkpointing in Flink please
read the corresponding documentation: 
    +[Data Streaming Fault Tolerance]({{ site.baseurl }}/internals/stream_checkpointing.html).
    +
    +## Task Lifecycle
    +
    +Following that brief introduction on the operator's main phases, this section describes
in more detail how a task calls 
    +the respective methods during its execution on a cluster. The sequence of the phases
described here is mainly included 
    +in the `invoke()` method of the `StreamTask` class. The remainder of this document is
split into two subsections, one 
    +describing the phases during a regular, fault-free execution of a task (see [Normal Execution](#normal-execution)),
and 
    +(a shorter) one describing the different sequence followed in case the task is cancelled
(see [Interrupted Execution](#interrupted-execution)), 
    +either manually, or due some other reason, *e.g.* an exception thrown during execution.
    +
    +### Normal Execution
    +
    +The steps a task goes through when executed until completion without being interrupted
are illustrated below:
    +
    +	    TASK::setInitialState
    --- End diff --
    
    In the task, all the described methods are called once.


> Add documentation about the task lifecycle.
> -------------------------------------------
>
>                 Key: FLINK-5937
>                 URL: https://issues.apache.org/jira/browse/FLINK-5937
>             Project: Flink
>          Issue Type: Bug
>          Components: Documentation
>    Affects Versions: 1.3.0
>            Reporter: Kostas Kloudas
>            Assignee: Kostas Kloudas
>             Fix For: 1.3.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message