From: Ruiqin Yang
Date: Fri, 20 Jul 2018 01:04:29 -0700
Subject: Re: Failover in apache 1.8.0
To: dev@airflow.incubator.apache.org

Hi Shubham,

A worker running an Airflow task heartbeats regularly, and each heartbeat updates the task instance entry in the DB. The scheduler kills task instances that have gone a long time without a heartbeat (these are called zombie tasks), and if the task has retries left, it will try to reschedule it (given all trigger rules are satisfied).

If the workers are under heavy load, the scheduler can still schedule tasks (that is, put them into the worker queue); the tasks then wait in the queue until a worker picks them up. If a task never gets picked up and the scheduler loses track of it, its state is reset to NONE when the scheduler restarts. These are called orphan tasks.

FYI, inside Airbnb, Alex Guziel (@saguziel) has a patch that requeues tasks if they are not picked up by workers for a long time, and he plans to open source it.

Cheers,
Kevin Y

On Fri, Jul 20, 2018 at 12:40 AM Shubham Gupta wrote:
> Hi,
>
> I would like to know what happens if a Celery worker running one of the
> tasks crashes. Will the job be rescheduled?
>
> Also, if the scheduler is not able to schedule a task on time due to heavy
> load on all workers, what will happen to the task?
>
> Regards
> Shubham Gupta
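
For anyone who finds the zombie/orphan distinction in the reply easier to follow as code, here is a minimal sketch in plain Python. It is a toy model, not Airflow's actual implementation: the TaskInstance class, the ZOMBIE_THRESHOLD value, and the function names handle_zombies and reset_orphans are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical threshold; not a real Airflow setting name or default value.
ZOMBIE_THRESHOLD = timedelta(minutes=5)


@dataclass
class TaskInstance:
    """Toy stand-in for a task instance row in the metadata DB."""
    task_id: str
    state: Optional[str]  # "running", "queued", "up_for_retry", "failed", or None
    last_heartbeat: Optional[datetime]
    try_number: int = 1
    max_tries: int = 1  # total attempts allowed (1 means no retries)


def handle_zombies(tasks, now):
    """Treat running tasks with a stale heartbeat as zombies.

    A zombie moves to "up_for_retry" if it has attempts left, else "failed".
    Returns the ids of the tasks that were treated as zombies.
    """
    zombies = []
    for ti in tasks:
        if ti.state != "running" or ti.last_heartbeat is None:
            continue
        if now - ti.last_heartbeat > ZOMBIE_THRESHOLD:
            zombies.append(ti.task_id)
            ti.state = "up_for_retry" if ti.try_number < ti.max_tries else "failed"
    return zombies


def reset_orphans(tasks, known_to_scheduler):
    """On scheduler restart, reset queued tasks it lost track of to None."""
    orphans = []
    for ti in tasks:
        if ti.state == "queued" and ti.task_id not in known_to_scheduler:
            ti.state = None
            orphans.append(ti.task_id)
    return orphans


now = datetime(2018, 7, 20, 8, 0, 0)
tasks = [
    TaskInstance("healthy", "running", now - timedelta(minutes=1), 1, 3),
    TaskInstance("crashed_with_retry", "running", now - timedelta(minutes=30), 1, 3),
    TaskInstance("crashed_no_retry", "running", now - timedelta(minutes=30), 1, 1),
    TaskInstance("never_picked_up", "queued", None),
]
zombies = handle_zombies(tasks, now)
# Simulate a restarted scheduler that remembers none of the queued tasks.
orphans = reset_orphans(tasks, known_to_scheduler=set())
```

In this toy run, the two tasks whose worker stopped heartbeating are treated as zombies (one is retried, one fails because it has no attempts left), and the queued task that no worker picked up is reset to None so it can be scheduled again.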