Date: Wed, 11 May 2016 12:55:12 +0000 (UTC)
From: "Sabeer Zaman (JIRA)"
To: commits@airflow.incubator.apache.org
Reply-To: dev@airflow.incubator.apache.org
Subject: [jira] [Created] (AIRFLOW-104) State of `ExternalTaskSensor` task when the external task is marked as "failure"

Sabeer Zaman created AIRFLOW-104:
------------------------------------

             Summary: State of `ExternalTaskSensor` task when the external task is marked as "failure"
                 Key: AIRFLOW-104
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-104
             Project: Apache Airflow
          Issue Type: Improvement
            Reporter: Sabeer Zaman
            Priority: Minor

Dear Airflow Maintainers,

Before I tell you about my issue, let me describe my environment:

h3. Environment

* Version of Airflow: v1.6.2
* Airflow components and configuration: Running with CeleryExecutor (separate docker containers running webserver, worker, rabbitmq and mysql db)
* Operating System: {{Darwin Kernel Version 15.3.0: Thu Dec 10 18:40:58 PST 2015; root:xnu-3248.30.4~1/RELEASE_X86_64 x86_64}}
* Python Version: 2.7.6

Now that you know a little about me, let me tell you about the issue I am having:

h3. Description of Issue

I created two DAGs - let's call them {{dag_a}} and {{dag_b}}. One of the tasks in {{dag_b}} is an {{ExternalTaskSensor}} referencing a task with {{task_id="external_task"}} in {{dag_a}}.
So the code looked as shown below:

{code}
# in DAG definition for "dag_a"
# ... imports, boilerplate setup - e.g., defining `default_args`

dag = DAG(
    dag_id="dag_a",
    default_args=default_args,
    schedule_interval="0 0 * * *",
)

external_task = DummyOperator(
    task_id="external_task",
    dag=dag,
)
{code}

{code}
# in DAG definition for "dag_b"
# ... imports, boilerplate setup - e.g., defining `default_args`

dag = DAG(
    dag_id="dag_b",
    default_args=default_args,
    schedule_interval="0 0 * * *",
)

# Sensor in dag_b that waits on "external_task" in dag_a
task_sensor = ExternalTaskSensor(
    task_id="dag_a.external_task",
    external_dag_id="dag_a",
    external_task_id="external_task",
    dag=dag,
)
{code}

To test failure behavior, I marked the task with {{task_id="external_task"}} in {{dag_a}} as "failed" (for a particular execution date). I then ran the backfill for the _same execution date_ for {{dag_b}}.

* What did you expect to happen?
** I expected the task named {{"dag_a.external_task"}} in {{dag_b}} to be marked either as {{failed}} or {{upstream_failed}}, since the actual task it was referencing in {{dag_a}} failed.
* What happened instead?
** The log for the task {{"dag_a.external_task"}} in {{dag_b}} showed that it kept poking {{external_task}} in {{dag_a}} every minute.

h3. Requested Change

Looking at the logic in the [{{poke}} function for the {{ExternalTaskSensor}}|https://github.com/airbnb/airflow/blob/1.7.0/airflow/operators/sensors.py#L178-L200], it's evident that it acts as a regular Airflow Sensor: it just waits until something becomes true, and in no way couples the state of the current task to the state of the external task.

That being said, is it reasonable to request such behavior (i.e., the {{ExternalTaskSensor}}'s state is set to failed if the task it's waiting on is marked as {{failed}})? I'd be willing to take a stab at adding the logic, but I'd like to make sure that this is in line with this Sensor's intended behavior, or whether there's a suggested alternative way of achieving this behavior.
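To make the requested behavior concrete, here is a minimal sketch of the decision I have in mind, written as plain Python so it runs without Airflow. Everything in it is hypothetical: {{ExternalTaskFailed}}, {{poke_once}}, {{wait_for}} and the two state sets are names I made up for illustration, not Airflow APIs, and treating {{upstream_failed}} as a failure state is my assumption. In a real sensor the state would be looked up in the metadata database rather than passed in.

```python
# Hypothetical sketch only: none of these names are part of Airflow.


class ExternalTaskFailed(Exception):
    """Raised when the watched external task is in a failed state."""


# Which states count as "done" or "failed" is an assumption of this sketch.
SUCCESS_STATES = {"success"}
FAILED_STATES = {"failed", "upstream_failed"}


def poke_once(external_state):
    """Decide what a single poke should do for one observed state.

    Returns True when the external task succeeded, False when the sensor
    should keep waiting, and raises ExternalTaskFailed when the external
    task failed, instead of poking forever.
    """
    if external_state in FAILED_STATES:
        raise ExternalTaskFailed(
            "external task is in state %r" % (external_state,)
        )
    return external_state in SUCCESS_STATES


def wait_for(observed_states):
    """Poke through a sequence of observed states.

    Returns the number of pokes needed to see a success state. Raises
    RuntimeError if the observations run out first (a stand-in for a
    sensor timeout).
    """
    for poke_count, state in enumerate(observed_states, start=1):
        if poke_once(state):
            return poke_count
    raise RuntimeError("ran out of observations before success")
```

With this logic, a state of {{None}} (no task instance recorded yet) or {{running}} keeps the sensor waiting, while {{failed}} stops it right away. Presumably raising from {{poke}} would mark the sensor task itself as failed, but I have not verified that against the sensor base class.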