From: Dan Davydov
Date: Wed, 22 Aug 2018 15:59:21 -0400
Subject: Re: Why not mark inactive DAGs in the main scheduler loop?
To: dev@airflow.incubator.apache.org

Agreed on delegation to a subprocess but I think that can come as part of a
larger redesign (maybe along with uploading DAG import errors etc). The query
should be quite fast so it should not have a significant impact on the
Scheduler times.

On Wed, Aug 22, 2018 at 3:52 PM Maxime Beauchemin <maximebeauchemin@gmail.com> wrote:

> I'd rather the scheduler delegate that to one of the minions (subprocess)
> if possible. We should keep everything we can off the main thread.
>
> BTW I've been speaking about renaming the scheduler to "supervisor" for a
> while now. While renaming may be a bit tricky (updating all references in
> the code), we should think of the scheduler as more of a supervisor as it
> takes on all sorts of supervision-related tasks.
>
> Tangent: we need to start thinking about allowing for a distributed
> scheduler too, and I'm thinking we need to be careful around the tasks that
> shouldn't be parallelized (this may or may not be one of them). We'll need
> to do very basic leader election and taking/releasing locks while running
> these tasks. I'm thinking we can just set flags in the database to do that.
>
> Max
>
> On Wed, Aug 22, 2018 at 12:19 PM Taylor Edmiston wrote:
>
> > I'm not super familiar with this part of the scheduler. What exactly are
> > the implications of doing this mid-loop vs at scheduler termination?
> > Is there a use case where DAGs hit this besides having been deleted?
> >
> > The deactivate_stale_dags call doesn't appear to be super expensive or
> > anything like that.
> >
> > This seems like a reasonable idea to me.
> >
> > *Taylor Edmiston*
> >
> > On Wed, Aug 22, 2018 at 2:32 PM Dan Davydov wrote:
> >
> > > I see some PRs creating endpoints to delete DAGs and other things related
> > > to manually deleting DAGs from the DB, but is there a good reason why we
> > > can't just move the deactivating DAG logic into the main scheduler loop?
> > >
> > > The scheduler already has some code like this, but it only runs when the
> > > Scheduler terminates:
> > >
> > > if all_files_processed:
> > >     self.log.info(
> > >         "Deactivating DAGs that haven't been touched since %s",
> > >         execute_start_time.isoformat()
> > >     )
> > >     models.DAG.deactivate_stale_dags(execute_start_time)
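
For illustration, the idea discussed in this thread (running the stale-DAG
deactivation after every full pass of the loop instead of only at scheduler
termination) could be sketched roughly as below. This is a minimal sketch and
not Airflow's code: DagRegistry, touch, and run_scheduler_loop are hypothetical
names, and an in-memory dict stands in for the DAG table. Only the name
deactivate_stale_dags is borrowed from the snippet quoted above.

```python
from datetime import datetime, timedelta


class DagRegistry:
    """Hypothetical in-memory stand-in for the DAG table (not Airflow's model)."""

    def __init__(self):
        self.last_seen = {}  # dag_id -> time the DAG was last seen in a parsed file
        self.active = {}     # dag_id -> whether the DAG is currently marked active

    def touch(self, dag_id, when):
        """Record that a DAG was found while processing files at time `when`."""
        self.last_seen[dag_id] = when
        self.active[dag_id] = True

    def deactivate_stale_dags(self, cutoff):
        """Mark active DAGs not seen since `cutoff` as inactive; return their ids."""
        stale = sorted(
            dag_id
            for dag_id, seen in self.last_seen.items()
            if seen < cutoff and self.active[dag_id]
        )
        for dag_id in stale:
            self.active[dag_id] = False
        return stale


def run_scheduler_loop(registry, dag_ids_per_pass, start, step):
    """Simulate scheduler passes, deactivating stale DAGs after each full pass."""
    now = start
    for dag_ids in dag_ids_per_pass:
        pass_start = now
        for dag_id in dag_ids:
            registry.touch(dag_id, now)
        now += step
        # All files were processed in this pass, so any DAG not touched
        # since the pass started is treated as stale.
        registry.deactivate_stale_dags(pass_start)
    return registry


# Usage: DAG "b" disappears from the files before the second pass.
registry = run_scheduler_loop(
    DagRegistry(),
    dag_ids_per_pass=[["a", "b"], ["a"]],
    start=datetime(2018, 8, 22),
    step=timedelta(seconds=5),
)
print(registry.active)
```

In this sketch a DAG that reappears in a later pass is simply marked active
again by touch, which is an assumption about the desired behaviour rather than
something stated in the thread.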