From: davydov@apache.org
To: commits@airflow.incubator.apache.org
Reply-To: dev@airflow.incubator.apache.org
Message-Id: <131a84372aff4c2aa47b1daf94a42c5c@git.apache.org>
Subject: incubator-airflow git commit: [AIRFLOW-1074] Don't count queued tasks for concurrency limits
Date: Wed, 12 Apr 2017 18:56:42 +0000 (UTC)

Repository: incubator-airflow
Updated Branches:
  refs/heads/master 6b1c327ee -> 8f9f5084b

[AIRFLOW-1074] Don't count queued tasks for concurrency limits

There may be orphaned tasks that are queued but not in a running dag run,
and these will not be cleared. We should not count them, as they will
interfere with the concurrency limits.

I hate to do this, but I changed my mind on counting queued tasks.

1. Queued tasks that are actually queued generally get set to running
   pretty quickly.
2. Because of the worker-side check, we won't actually exceed the
   concurrency limit.

Because of this, I don't think the queued thing is a big deal. I'm more
worried about orphaned tasks that are in QUEUED state but not in a running
dag_run (so they won't get reset) interfering with concurrency.

Closes #2221 from saguziel/aguziel-concurrency-2

Project: http://git-wip-us.apache.org/repos/asf/incubator-airflow/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-airflow/commit/8f9f5084
Tree: http://git-wip-us.apache.org/repos/asf/incubator-airflow/tree/8f9f5084
Diff: http://git-wip-us.apache.org/repos/asf/incubator-airflow/diff/8f9f5084

Branch: refs/heads/master
Commit: 8f9f5084bfdc2aa4017fee12e22d2e94672765ba
Parents: 6b1c327
Author: Alex Guziel
Authored: Wed Apr 12 11:56:03 2017 -0700
Committer: Dan Davydov
Committed: Wed Apr 12 11:56:06 2017 -0700

----------------------------------------------------------------------
 airflow/jobs.py | 3 ++-
 tests/jobs.py   | 6 +++---
 2 files changed, 5 insertions(+), 4 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/8f9f5084/airflow/jobs.py
----------------------------------------------------------------------
diff --git a/airflow/jobs.py b/airflow/jobs.py
index f031f6e..18cd82e 100644
--- a/airflow/jobs.py
+++ b/airflow/jobs.py
@@ -1064,11 +1064,12 @@ class SchedulerJob(BaseJob):
             dag_id = task_instance.dag_id
 
             if dag_id not in dag_id_to_possibly_running_task_count:
+                # TODO(saguziel): also check against QUEUED state, see AIRFLOW-1104
                 dag_id_to_possibly_running_task_count[dag_id] = \
                     DAG.get_num_task_instances(
                         dag_id,
                         simple_dag_bag.get_dag(dag_id).task_ids,
-                        states=[State.RUNNING, State.QUEUED],
+                        states=[State.RUNNING],
                         session=session)
 
             current_task_concurrency = dag_id_to_possibly_running_task_count[dag_id]

http://git-wip-us.apache.org/repos/asf/incubator-airflow/blob/8f9f5084/tests/jobs.py
----------------------------------------------------------------------
diff --git a/tests/jobs.py b/tests/jobs.py
index e3caa5d..e99778a 100644
--- a/tests/jobs.py
+++ b/tests/jobs.py
@@ -504,14 +504,14 @@ class SchedulerJobTest(unittest.TestCase):
         ti1.refresh_from_db()
         ti2.refresh_from_db()
         ti1.state = State.RUNNING
-        ti2.state = State.QUEUED
+        ti2.state = State.RUNNING
         session.merge(ti1)
         session.merge(ti2)
         session.commit()
 
         self.assertEqual(State.RUNNING, dr1.state)
         self.assertEqual(2, DAG.get_num_task_instances(dag_id, dag.task_ids,
-            states=[State.RUNNING, State.QUEUED], session=session))
+            states=[State.RUNNING], session=session))
 
         # create second dag run
         dr2 = scheduler.create_dag_run(dag)
@@ -538,7 +538,7 @@ class SchedulerJobTest(unittest.TestCase):
         self.assertEqual(3, DAG.get_num_task_instances(dag_id, dag.task_ids,
             states=[State.RUNNING, State.QUEUED], session=session))
         self.assertEqual(State.RUNNING, ti1.state)
-        self.assertEqual(State.QUEUED, ti2.state)
+        self.assertEqual(State.RUNNING, ti2.state)
         six.assertCountEqual(self, [State.QUEUED, State.SCHEDULED], [ti3.state, ti4.state])
         session.close()
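
The effect described in the commit message can be illustrated with a small standalone sketch. It does not use Airflow itself: the `TI` tuple, the string state constants, and the `count_in_states` and `open_slots` helpers are hypothetical stand-ins for the task instance model, `State`, and the scheduler's counting logic, and the concurrency limit of 3 is an arbitrary value chosen for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a task instance row; not Airflow's TaskInstance model.
TI = namedtuple("TI", ["task_id", "state", "dag_run_running"])

# Hypothetical state constants standing in for airflow.utils.state.State.
RUNNING, QUEUED, SCHEDULED = "running", "queued", "scheduled"


def count_in_states(task_instances, states):
    """Count task instances whose state is one of `states`."""
    return sum(1 for ti in task_instances if ti.state in states)


def open_slots(task_instances, concurrency, states):
    """Slots left under a DAG-level limit, given which states count against it."""
    return max(0, concurrency - count_in_states(task_instances, states))


tis = [
    TI("a", RUNNING, True),
    TI("b", QUEUED, False),    # orphaned: queued, but its dag run is not running
    TI("c", QUEUED, False),    # orphaned
    TI("d", SCHEDULED, True),  # waiting for a slot
]

# Before the change: orphaned QUEUED tasks use up the whole limit.
old_slots = open_slots(tis, concurrency=3, states=[RUNNING, QUEUED])
# After the change: only RUNNING tasks count, so "d" can be scheduled.
new_slots = open_slots(tis, concurrency=3, states=[RUNNING])
print(old_slots, new_slots)
```

In this sketch the two orphaned QUEUED entries never leave that state, so counting them would block task "d" indefinitely, which is the interference the commit message is concerned about.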