Mailing-List: contact dev-help@airflow.incubator.apache.org; run by ezmlm
Reply-To: dev@airflow.incubator.apache.org
Delivered-To: mailing list dev@airflow.incubator.apache.org
MIME-Version: 1.0
Sender: fokko@driesprongen.nl
In-Reply-To: <4f524e0b-f6ae-6536-c0c2-32c83c8a91eb@stefan-seelmann.de>
References:
 <4f524e0b-f6ae-6536-c0c2-32c83c8a91eb@stefan-seelmann.de>
From: "Driesprong, Fokko"
Date: Mon, 28 May 2018 16:17:00 +0200
Subject: Re: How to wait for external process
To: dev@airflow.incubator.apache.org
Content-Type: text/plain; charset="UTF-8"

Hi Stefan,

Afaik there isn't a more efficient way of doing this. DAGs that rely on a
lot of sensors experience the same issue. The only way I can think of right
now is updating the state directly in the database, but then you need to
know what you are doing. I can imagine that this would be feasible using an
AWS Lambda function.

Hope this helps.

Cheers, Fokko

2018-05-26 17:50 GMT+02:00 Stefan Seelmann:

> Hello,
>
> I have a DAG (externally triggered) where some processing is done at an
> external system (EC2 instance). The processing is started by an Airflow
> task (via HTTP request). The DAG should only continue once that
> processing is completed. In a first naive implementation I created a
> sensor that gets the progress (via HTTP request) and only if status is
> "finished" returns true and the DAG run continues. That works but...
>
> ... the external processing can take hours or days, and during that time
> a worker is occupied which does nothing but HTTP GET and sleep. There
> will be hundreds of DAG runs in parallel which means hundreds of workers
> are occupied.
>
> I looked into other operators that do computation on external systems
> (ECSOperator, AWSBatchOperator) but they also follow that pattern and
> just wait/sleep.
>
> So I want to ask if there is a more efficient way to build such a
> workflow with Airflow?
>
> Kind Regards,
> Stefan
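
For illustration, the polling pattern described in the quoted message can be sketched in plain Python, without Airflow. This is only a rough sketch under assumptions: the names `wait_until_finished`, `get_status`, `poke_interval` and `timeout` are hypothetical and are not taken from the thread, and `get_status` stands in for whatever HTTP GET returns the progress of the external system.

```python
import time
from typing import Callable


def wait_until_finished(
    get_status: Callable[[], str],
    poke_interval: float = 60.0,
    timeout: float = 7 * 24 * 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Poll get_status() until it returns "finished".

    Returns the number of polls that were needed. Raises TimeoutError
    if `timeout` seconds pass without seeing "finished".
    """
    started = clock()
    polls = 0
    while True:
        polls += 1
        # Hypothetical status check, e.g. an HTTP GET against the external system.
        if get_status() == "finished":
            return polls
        if clock() - started > timeout:
            raise TimeoutError("external processing did not finish in time")
        # The caller is blocked here, doing nothing, between two checks.
        sleep(poke_interval)
```

The loop blocks its caller for the whole duration of the external processing. When such a loop runs inside a task, that is what keeps one worker occupied per waiting DAG run.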