Date: Sun, 8 Oct 2017 18:34:00 +0000 (UTC)
From: "Dmytro Kulyk (JIRA)"
To: commits@airflow.incubator.apache.org
Reply-To: dev@airflow.incubator.apache.org
Subject: [jira] [Comment Edited] (AIRFLOW-1296) DAGs using operators involving cascading skipped tasks fail prematurely

    [ https://issues.apache.org/jira/browse/AIRFLOW-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16196272#comment-16196272 ]

Dmytro Kulyk edited comment on AIRFLOW-1296 at 10/8/17 6:33 PM:
----------------------------------------------------------------

Have you tried to play with "trigger_rule"?
After the update we have the reverse situation: the SKIPPED status is being pushed even though trigger_rule is set to "all_done".

was (Author: kotyara):
have you tried to play with "trigger_rule"?
After update we've reverse situation when SKIPPED status is being pushed despite of trigger_rule set

> DAGs using operators involving cascading skipped tasks fail prematurely
> -----------------------------------------------------------------------
>
>                 Key: AIRFLOW-1296
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1296
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 1.8.1
>            Reporter: Daniel Huang
>            Assignee: Bolke de Bruin
>            Priority: Blocker
>             Fix For: 1.8.2
>
>
> So this is basically the same issue as AIRFLOW-872 and AIRFLOW-719. A workaround had fixed this (https://github.com/apache/incubator-airflow/pull/2125), but it was later reverted (https://github.com/apache/incubator-airflow/pull/2195). I totally agree with the reason for reverting, but I still think this is an issue.
> The issue is related to any operators that involve cascading skipped tasks, like ShortCircuitOperator or LatestOnlyOperator. These operators mark only their *direct* downstream tasks as SKIPPED; cascading the SKIPPED state to tasks further downstream of a skipped task is left up to the scheduler (see the latest-only operator docs about this expected behavior: https://airflow.incubator.apache.org/concepts.html#latest-run-only). However, the scheduler instead marks the DAG run as FAILED prematurely, before it has had a chance to skip all downstream tasks.
> This example DAG should reproduce the issue: https://gist.github.com/dhuang/61d38fb001c3a917edf4817bb0c915f9.
> Expected result: DAG succeeds with tasks - latest_only (success) -> dummy1 (skipped) -> dummy2 (skipped) -> dummy3 (skipped)
> Actual result: DAG fails with tasks - latest_only (success) -> dummy1 (skipped) -> dummy2 (none) -> dummy3 (none)
> I believe the results I'm seeing are because of this deadlock prevention logic, https://github.com/apache/incubator-airflow/blob/1.8.1/airflow/models.py#L4182.
> While the actual result shown above _could_ mean a deadlock, in this case it isn't one. Since this {{update_state}} logic is reached first in each scheduler run, dummy2/dummy3 never get a chance to have the SKIPPED state cascaded to them. Commenting out that block gives me the results I expect.
> [~bolke] I know you spent a while trying to reproduce my issue and weren't able to, but I'm still hitting this on a fresh environment, default configs, sqlite/mysql dbs, local/sequential/celery executors, and 1.8.1/master.
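
The ordering problem described in the report can be illustrated with a small toy model. This is only a sketch under stated assumptions and is not Airflow code: the function names (`looks_deadlocked`, `cascade_one_step`, `run`), the linear four-task chain, and the simplified deadlock rule ("unfinished tasks exist and none has a successful upstream") are all hypothetical stand-ins for the scheduler behavior the reporter describes. Passing `check_deadlock_first=False` roughly corresponds to commenting out the deadlock block.

```python
# Toy model of the reported behaviour. NOT Airflow code; every name here
# is hypothetical and the deadlock rule is a deliberate simplification.

SUCCESS, SKIPPED = "success", "skipped"

# Linear chain from the report: latest_only -> dummy1 -> dummy2 -> dummy3
CHAIN = ["latest_only", "dummy1", "dummy2", "dummy3"]
EDGES = list(zip(CHAIN, CHAIN[1:]))  # (upstream, task) pairs


def initial_states():
    # State right after latest_only has run: it succeeded and marked only
    # its direct downstream task as skipped; the rest have no state yet.
    return {"latest_only": SUCCESS, "dummy1": SKIPPED,
            "dummy2": None, "dummy3": None}


def looks_deadlocked(states):
    # Assumed stand-in for the deadlock check: some tasks are unfinished
    # and none of them has an upstream task that succeeded.
    unfinished = [task for _, task in EDGES if states[task] is None]
    runnable = [task for up, task in EDGES
                if states[task] is None and states[up] == SUCCESS]
    return bool(unfinished) and not runnable


def cascade_one_step(states):
    # Propagate SKIPPED by one level: the first stateless task whose
    # upstream is skipped becomes skipped. Returns True if anything changed.
    for up, task in EDGES:
        if states[task] is None and states[up] == SKIPPED:
            states[task] = SKIPPED
            return True
    return False


def run(check_deadlock_first):
    # Simulate scheduler passes until every task has a state or the run fails.
    states = initial_states()
    while any(states[t] is None for t in CHAIN):
        if check_deadlock_first and looks_deadlocked(states):
            return "failed", states  # run failed before any cascade happened
        if not cascade_one_step(states):
            return "failed", states  # nothing left to propagate
    return "success", states


if __name__ == "__main__":
    print(run(check_deadlock_first=True))
    print(run(check_deadlock_first=False))
```

In this model, running the deadlock check first fails the run with dummy2 and dummy3 still stateless, matching the "Actual result" above, while skipping the check lets the SKIPPED state reach dummy3 and the run succeeds, matching the "Expected result".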