From: "Xiao Zhu (JIRA)"
To: commits@airflow.incubator.apache.org
Reply-To: dev@airflow.incubator.apache.org
Date: Tue, 24 Apr 2018 22:35:00 +0000 (UTC)
Subject: [jira] [Created] (AIRFLOW-2372) SubDAGs should share parallelism of parent DAG

Xiao Zhu created AIRFLOW-2372:
---------------------------------

             Summary: SubDAGs should share parallelism of parent DAG
                 Key: AIRFLOW-2372
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2372
             Project: Apache Airflow
          Issue Type: Wish
    Affects Versions: Airflow 1.9.0
         Environment: Airflow 1.9.0 with a local scheduler and LocalExecutor; parallelism = 32, dag_concurrency = 16
            Reporter: Xiao Zhu

It seems that subDAGs are currently scheduled just like normal DAGs. If a DAG has many parallel subDAGs, each with many operators, triggering that DAG triggers those subDAGs as if they were normal DAGs. They can easily take all the resources of the scheduler (limited by dag_concurrency), and other DAGs have to wait for those subDAGs.
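A rough way to see the scale of the concern is to compare the number of tasks that could run at once when every subDAG applies a limit on its own with the number when a single limit covers the parent and all of its subDAGs. The sketch below is a toy model, not Airflow's scheduler logic: the function `peak_running_tasks` and the workload figures (20 subDAGs of 10 tasks each) are hypothetical, and the limit of 32 is taken from the `parallelism` setting in the environment above.

```python
def peak_running_tasks(num_subdags, ops_per_subdag, limit, shared):
    """Upper bound on simultaneously running tasks in a toy model.

    shared=False: every subDAG applies `limit` independently.
    shared=True:  one `limit` covers the parent and all of its subDAGs.
    """
    total_tasks = num_subdags * ops_per_subdag
    if shared:
        # A single cap applies to all tasks together.
        return min(total_tasks, limit)
    # Each subDAG is capped separately, so the per-subDAG bounds add up.
    return num_subdags * min(ops_per_subdag, limit)


# Hypothetical workload: 20 subDAGs with 10 tasks each, limit of 32.
independent = peak_running_tasks(20, 10, 32, shared=False)
capped = peak_running_tasks(20, 10, 32, shared=True)
print(independent, capped)  # prints: 200 32
```

In this simplified model the independent case grows with the number of subDAGs, while the shared case never exceeds the configured limit, which is the behavior this issue asks for.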
For example, consider this DAG, run with a local scheduler and LocalExecutor, parallelism = 32, and dag_concurrency = 16:

{quote}
import logging

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import PythonOperator
from airflow.operators.subdag_operator import SubDagOperator

log = logging.getLogger(__name__)

NUM_SUBDAGS = 20
NUM_OPS_PER_SUBDAG = 10


def logging_func(id):
    log.info("Now running id: {}".format(id))


def build_dag(dag_id, num_ops):
    # Build one subDAG: a start task fanning out to num_ops parallel tasks.
    dag = DAG(dag_id)
    start_op = DummyOperator(task_id='start', dag=dag)
    for i in range(num_ops):
        op = PythonOperator(
            task_id=str(i),
            python_callable=logging_func,
            op_args=[i],
            dag=dag
        )
        start_op >> op
    return dag


parent_id = 'consistent_failure'
with DAG(
    parent_id
) as dag:
    start_op = DummyOperator(task_id='start')
    # The parent fans out to NUM_SUBDAGS subDAGs that can all run in parallel.
    for i in range(NUM_SUBDAGS):
        task_id = "subdag_{}".format(i)
        op = SubDagOperator(
            task_id=task_id,
            subdag=build_dag("{}.{}".format(parent_id, task_id), NUM_OPS_PER_SUBDAG)
        )
        start_op >> op
{quote}

When I trigger this DAG, Airflow tries to run many of the subDAGs at the same time. Since they don't share parallelism with their parent DAG, each of them tries to run its operators in parallel, which results in 500+ Python processes created by Airflow.

Ideally those subDAGs would share parallelism with their parent DAG (and thus with each other), so that when I trigger this DAG, at most 32 operators, including the ones in the subDAGs, are running at any time.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)