Date: Sun, 22 Oct 2017 21:04:00 +0000 (UTC)
From: "Bolke de Bruin (JIRA)"
To: commits@airflow.incubator.apache.org
Reply-To: dev@airflow.incubator.apache.org
Subject: [jira] [Commented] (AIRFLOW-1641) Task gets stuck in queued state

    [ https://issues.apache.org/jira/browse/AIRFLOW-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214452#comment-16214452 ]

Bolke de Bruin commented on AIRFLOW-1641:
-----------------------------------------

Because I need the feedback: I don't experience the issue myself, and what needs to be updated is pretty core to the scheduler (the current patch doesn't work correctly). Beyond that, the automated testing helps to catch some errors (the tests don't pass currently).

In addition to what is written here, you can also limit parallelism to work around the issue. Staying in QUEUED means the task instance can't even start, most likely due to load or memory issues. This should be dealt with correctly (that is the real issue I'm trying to solve in the patch), but it will fail your task anyway.

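
As a rough illustration of the "limit parallelism" workaround mentioned above, the fragment below lowers the concurrency limits in `airflow.cfg`. It is only a sketch: the values 4 and 2 are made-up examples, not numbers recommended anywhere in this thread, and it assumes the `[core]` option names used by the 1.8/1.9 series. The right values depend on how much CPU and memory the LocalExecutor host actually has.

```ini
[core]
# Upper bound on task instances running at the same time across the
# whole installation. The value 4 is illustrative, not a recommendation.
parallelism = 4

# Upper bound on task instances the scheduler lets run concurrently
# within a single DAG. The value 2 is illustrative as well.
dag_concurrency = 2
```

The same options can also be supplied as environment variables named `AIRFLOW__CORE__PARALLELISM` and `AIRFLOW__CORE__DAG_CONCURRENCY`. Either way, the scheduler has to be restarted before the new limits take effect.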
> Task gets stuck in queued state
> -------------------------------
>
>                 Key: AIRFLOW-1641
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1641
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 1.8.0
>         Environment: Linux
>            Reporter: Mas
>            Assignee: Bolke de Bruin
>              Labels: queued, scheduler, stuck, task
>             Fix For: 1.9.0
>
>
> Hello,
> I have one DAG with ~20 tasks.
> The DAG runs daily and some tasks can sometimes last for hours, depending on the data being processed.
> There are some interactions with AWS and a remote DB.
> I only use LocalExecutor.
> This issue is about the fact that sometimes (randomly, and without any clear reason) one of the tasks (which one is also random) gets stuck in the "queued" state and never starts running.
> The manual workaround is to restart the task by clearing it.
> Does anyone have an idea of the underlying cause, and how to avoid it in the future?
> Thanks in advance for your help.
> PS: other people are facing the same behaviour: [link|https://stackoverflow.com/questions/45853013/airflow-tasks-get-stuck-at-queued-status-and-never-gets-running]

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)