airflow-commits mailing list archives

From "Bolke de Bruin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-932) Backfills delete existing task instances and mark them as removed
Date Thu, 02 Mar 2017 22:08:45 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893124#comment-15893124 ]

Bolke de Bruin commented on AIRFLOW-932:
----------------------------------------

The issue resides in cli.py and only occurs when the backfill is restricted to specific tasks
(i.e. when a task regex is given):

{code}
    if args.task_regex:
        dag = dag.sub_dag(
            task_regex=args.task_regex,
            include_upstream=not args.ignore_dependencies)
{code}

This creates a dag containing only a subset of the tasks, with the same name as the original.
Hence, when the integrity of a dag run is verified, any task outside that subset is set to removed.

get_task_instances picks up all instances, including "removed" ones from the original (whole)
dag, and the backfill does not filter them out. Hence the lists mismatch and an AirflowException
is thrown.
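
To make the mismatch concrete, here is a toy model in plain Python. It is not Airflow code:
every name in it (the task ids, the sub_dag and verify_integrity helpers, the dict layout)
is a hypothetical stand-in, and it only assumes the behaviour described above.

```python
# Toy model only: these helpers are hypothetical stand-ins, not Airflow's API.
full_dag_tasks = {"extract", "transform", "load"}

# Task instances already recorded for the dag run: task_id -> state.
dag_run_tis = {"extract": "success", "transform": "success", "load": "success"}


def sub_dag(task_ids, selected):
    """Return the subset of task ids that were selected (same dag name assumed)."""
    return {t for t in task_ids if t in selected}


def verify_integrity(tis, dag_tasks):
    """Mark every instance whose task is missing from dag_tasks as removed."""
    return {t: (s if t in dag_tasks else "removed") for t, s in tis.items()}


subset = sub_dag(full_dag_tasks, {"load"})
after_verify = verify_integrity(dag_run_tis, subset)

expected = len(subset)      # what the backfill plans to run
found = len(after_verify)   # what an unfiltered lookup returns
mismatch = expected != found
```

In this sketch the two tasks outside the subset end up as removed, and the unfiltered lookup
still returns three instances where the backfill expects one.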

The quick and dirty fix is to not mark tasks as removed when run from a backfill, and to filter
the list returned by "get_task_instances".
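
A rough sketch of what that quick fix could look like, again as a plain Python toy model and
not as a patch. The mark_removed flag and the filter_for_backfill helper are hypothetical names,
and filtering by membership in the subset (plus dropping "removed" states) is an assumption about
what the filter should do.

```python
# Toy model only: the flag and helper names below are hypothetical, not Airflow's API.

def verify_integrity(tis, dag_tasks, mark_removed=True):
    """Optionally skip marking instances as removed (e.g. when run from a backfill)."""
    if not mark_removed:
        return dict(tis)
    return {t: (s if t in dag_tasks else "removed") for t, s in tis.items()}


def filter_for_backfill(tis, dag_tasks):
    """Keep only instances that belong to the subset and are not already removed."""
    return {t: s for t, s in tis.items() if t in dag_tasks and s != "removed"}


existing = {"extract": "success", "transform": "success", "load": "success"}
subset = {"load"}

# Backfill path: leave the states of tasks outside the subset alone ...
kept = verify_integrity(existing, subset, mark_removed=False)
# ... and compare only the instances the backfill actually cares about.
to_run = filter_for_backfill(kept, subset)
```

Here the existing instances keep their states and the filtered list has the same length as
the subset, so the comparison no longer fails.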

However, the functionality of sub_dag is awkward imho and might need a real fix. What do you
think?





> Backfills delete existing task instances and mark them as removed
> -----------------------------------------------------------------
>
>                 Key: AIRFLOW-932
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-932
>             Project: Apache Airflow
>          Issue Type: Sub-task
>          Components: backfill
>            Reporter: Dan Davydov
>            Priority: Blocker
>
> I'm still investigating.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
