From: Kaxil Naik
Date: Tue, 16 Jun 2020 00:38:26 +0100
Subject: Re: [DISCUSS] Parametrized DAGs
To: dev@airflow.apache.org

Oh yes, that sounds good. +1 to the idea; as long as it can return a
JSON-serializable object, I am fine with it.

On Tue, Jun 16, 2020 at 12:29 AM Gerard Casas Saez wrote:

> By XCom support before XComArg, I meant XCom values passed as
> parameters to operators. You needed to use
> {{context['ti'].xcom_pull(…)}} instead of the XComArg objects you can
> use on the latest master.
>
> Gerard Casas Saez
> Twitter | Cortex | @casassaez
> On Jun 15, 2020, 5:02 PM -0600, Kaxil Naik wrote:
> > Isn't it already possible using params (
> > https://github.com/apache/airflow/blob/master/airflow/models/dag.py#L138-L141
> > )?
> >
> > Sample usage:
> > https://gist.github.com/kaxil/335d90da8821a4e515046ff0f470fc97#file-airflow_params_usage_2-py
> >
> > Currently, we allow passing params in the DAG and overriding them
> > using dagrun_conf via the CLI or UI.
> >
> > Code:
> >
> > - https://github.com/apache/airflow/blob/3de68501b7a76dce24bfd8a8b4659eedcf7ac29c/airflow/models/taskinstance.py#L1335-L1336
> > - https://github.com/apache/airflow/blob/3de68501b7a76dce24bfd8a8b4659eedcf7ac29c/airflow/models/taskinstance.py#L1454-L1456
> >
> > Or am I missing something?
> >
> > Regards,
> > Kaxil
> >
> > On Mon, Jun 15, 2020 at 11:48 PM Gerard Casas Saez wrote:
> >
> > > I do not think we should support RunTimeParams that modify the
> > > topology (at least at the beginning).
> > >
> > > Modifying the topology involves considerably deeper changes.
> > > Even though it may be useful, I believe the time cost is high
> > > relative to the value, so enabling parametrization on a fixed
> > > topology is definitely an easier first step and will probably
> > > bring enough value.
> > >
> > > Curious what other people's thoughts are on this?
> > >
> > > Gerard Casas Saez
> > > Twitter | Cortex | @casassaez
> > > On Jun 12, 2020, 10:00 AM -0600, Dan Davydov wrote:
> > > > I think this is a great idea! One thing I think we should
> > > > figure out before implementing is how to do so alongside DAG
> > > > serialization, i.e. letting these params modify DAG topology
> > > > might make it hard to store serialized representations for the
> > > > Airflow services to consume and render, though that may be
> > > > more of a statement about the dagrun configuration and
> > > > orthogonal to the change proposed here.
> > > >
> > > > On Thu, Jun 11, 2020 at 7:58 PM Gerard Casas Saez wrote:
> > > >
> > > > > As we wrap up the work on AIP-31 (functional definition), I
> > > > > wanted to bring another idea here for discussion.
> > > > >
> > > > > The concept is to parametrize pipelines using a class
> > > > > similar to the XComArg that we introduced recently. As of
> > > > > 1.10.10, we can use the UI to set the DagRun configuration
> > > > > on the trigger DAG view using a JSON blob.
> > > > >
> > > > > Accessing those values is still hard (you need to pull the
> > > > > DagRun from the current context and then access the conf
> > > > > object). My proposal would be to add a new class that is
> > > > > resolved on execution, similar to how we resolve XComArgs.
> > > > >
> > > > >     class DAGParam:
> > > > >         def __init__(self, key: str, default: Any, type: type):
> > > > >             self.key, self.default, self.type = key, default, type
> > > > >
> > > > >         def resolve(self, dag_run: DagRun):
> > > > >             return dag_run.conf[self.key]
> > > > >
> > > > >     # Raw usage:
> > > > >     with DAG(...) as dag:
> > > > >         param = DAGParam(key='number', default=3, type=int)
> > > > >         SomeOperator(num=param)
> > > > >
> > > > >     # From DAG object:
> > > > >     with DAG(...) as dag:
> > > > >         SomeOperator(num=dag.param(key='number', default=3, type=int))
> > > > >
> > > > >     # Decorator approach:
> > > > >     @dag(...)
> > > > >     def my_dag(number: int = 3):
> > > > >         SomeOperator(num=number)
> > > > >
> > > > > Gist:
> > > > > https://gist.github.com/casassg/aa29b4d5d7f07f16630e591e351e570a
> > > > >
> > > > > This would allow us to discover these params and surface
> > > > > them in the Trigger DAG UI
> > > > > <https://airflow.apache.org/blog/airflow-1.10.10/#allow-passing-dagrun-conf-when-triggering-dags-via-ui>
> > > > > as a better form, similar to what we currently have at
> > > > > Twitter (see DagConstructors here
> > > > > <https://blog.twitter.com/engineering/en_us/topics/insights/2018/ml-workflows.html>
> > > > > or image attached).
> > > > >
> > > > > Just wanted to drop this here to get people's thoughts!
> > > > >
> > > > > The idea is heavily inspired by Kubeflow PipelinesParams +
> > > > > pipeline decorator.
> > > > >
> > > > > Gerard Casas Saez
> > > > > Twitter | Cortex | @casassaez
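
For readers following the thread, here is a minimal, self-contained sketch of how the DAGParam idea quoted above might resolve a value at execution time. It is an illustration under stated assumptions, not the implementation discussed in the thread and not an existing Airflow API: `FakeDagRun` is a hypothetical stand-in for Airflow's DagRun that models only the `conf` attribute, and the fallback to `default` plus the coercion to `type` are assumed behaviours that the original pseudocode does not spell out.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class FakeDagRun:
    """Hypothetical stand-in for Airflow's DagRun; only `conf` is modelled."""
    conf: Optional[Dict[str, Any]] = None


class DAGParam:
    """Sketch of the proposed class; not an existing Airflow API."""

    def __init__(self, key: str, default: Any = None, type: type = str):
        self.key = key
        self.default = default
        self.type = type

    def resolve(self, dag_run: FakeDagRun) -> Any:
        # conf may be None when the run was triggered without a JSON blob.
        conf = dag_run.conf or {}
        # Assumption: fall back to the declared default when the key is absent.
        value = conf.get(self.key, self.default)
        if value is None:
            return None
        # Assumption: coerce to the declared type, so "5" from a form becomes 5.
        return self.type(value)


number = DAGParam(key="number", default=3, type=int)
print(number.resolve(FakeDagRun(conf={"number": "5"})))  # overridden at trigger time
print(number.resolve(FakeDagRun()))                      # falls back to the default
```

Keeping the resolved value a plain int, str, or similar also fits the condition raised in the reply at the top of this message, that whatever a parameter returns stays JSON serializable.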