From: siddharth anand
Date: Fri, 17 Mar 2017 16:33:11 -0700
Subject: Reminder : LatestOnlyOperator
To: dev@airflow.incubator.apache.org

With the Apache Airflow 1.8 release imminent, you may want to try out the *LatestOnlyOperator*.

If you want your DAG to run only for the most recent scheduled slot, regardless of backlog, this operator skips the downstream tasks of every DAG Run prior to the current time slot.

For example, I might have a DAG that takes a DB snapshot once a day. It might be that I paused that DAG for 2 weeks, or that I had set the start date to a fixed date 2 weeks in the past.
When I enable my DAG, I don't want it to run 14 days' worth of snapshots for the current state of the DB -- that's unnecessary work. The LatestOnlyOperator avoids that work.

https://github.com/apache/incubator-airflow/commit/edf033be65b575f44aa221d5d0ec9ecb6b32c67a

With it, you can simply use

    latest_only = LatestOnlyOperator(task_id='latest_only', dag=dag)

instead of

    import logging
    from datetime import datetime

    from airflow.operators.python_operator import ShortCircuitOperator

    def skip_to_current_job(ds, **kwargs):
        now = datetime.now()
        # Window that follows this run's schedule slot.
        left_window = kwargs['dag'].following_schedule(kwargs['execution_date'])
        right_window = kwargs['dag'].following_schedule(left_window)
        logging.info('Left Window {}, Now {}, Right Window {}'.format(left_window, now, right_window))
        # Returning False short-circuits (skips) all downstream tasks.
        if not now <= right_window:
            logging.info('Not latest execution, skipping downstream.')
            return False
        return True

    short_circuit = ShortCircuitOperator(
        task_id='short_circuit_if_not_current_job',
        provide_context=True,
        python_callable=skip_to_current_job,
        dag=dag
    )

-s
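To make the window check in the ShortCircuitOperator callable concrete, here is a minimal sketch that runs without Airflow. It assumes a fixed `timedelta` interval as a stand-in for `dag.following_schedule()`. The helper name `is_latest_run` and its parameters are hypothetical, chosen only for illustration, and the dates are made up to match the two-week backlog scenario.

```python
from datetime import datetime, timedelta


def is_latest_run(execution_date, interval, now):
    """Return True if `now` is no later than the end of the window after this slot.

    Hypothetical helper: adding a fixed interval stands in for the two
    dag.following_schedule() calls in the callable from the email.
    """
    left_window = execution_date + interval
    right_window = left_window + interval
    return now <= right_window


# A daily DAG re-enabled at noon on 2017-03-17 with 14 backlogged slots.
now = datetime(2017, 3, 17, 12, 0)
daily = timedelta(days=1)
backlog = [datetime(2017, 3, 3) + daily * i for i in range(14)]

# Only slots whose following window has not yet closed would run downstream tasks.
latest = [d for d in backlog if is_latest_run(d, daily, now)]
print(latest)
```

Under these assumptions only the 2017-03-16 slot passes the check; the 13 older slots would have their downstream tasks skipped, which is the behaviour described for the snapshot DAG.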