Date: Wed, 1 Jan 2020 10:39:00 +0000 (UTC)
From: "Jarek Potiuk (Jira)"
To: commits@airflow.incubator.apache.org
Subject: [jira] [Reopened] (AIRFLOW-2511) Subdag failed by scheduler deadlock

     [ https://issues.apache.org/jira/browse/AIRFLOW-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jarek Potiuk reopened AIRFLOW-2511:
-----------------------------------

> Subdag failed by scheduler deadlock
> -----------------------------------
>
>                 Key: AIRFLOW-2511
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2511
>             Project: Apache Airflow
>          Issue Type: Bug
>    Affects Versions: 1.9.0
>            Reporter: Yohei Shimomae
>            Assignee: lufeng
>            Priority: Major
>             Fix For: 1.10.3
>
>
> I am using a subdag, and sometimes the main DAG is marked as failed because of the following error. When this happens, the tasks in the subdag stop.
> {code:java}
> hourly_dag = DAG(
>     hourly_dag_name,
>     default_args=dag_default_args,
>     params=dag_custom_params,
>     schedule_interval=config_values.hourly_job_interval,
>     max_active_runs=2)
> hourly_subdag = SubDagOperator(
>     task_id='s3_to_hive',
>     subdag=LoadFromS3ToHive(
>         hourly_dag,
>         's3_to_hive'),
>     dag=hourly_dag)
> {code}
> I got this error in the main DAG. Is this a bug in the scheduler?
> {code:java}
> [2018-05-22 21:52:19,683] {models.py:1595} ERROR - This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (_mysql_exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') [SQL: 'UPDATE task_instance SET state=%s WHERE task_instance.task_id = %s AND task_instance.dag_id = %s AND task_instance.execution_date = %s'] [parameters: ('queued', 'transfer_from_tmp_table_into_cleaned_table', 'rfid_warehouse_carton_wh_g_dl_dwh_csv_uqjp_1h.s3_to_hive', datetime.datetime(2018, 5, 7, 5, 2))] (Background on this error at: http://sqlalche.me/e/e3q8)
> Traceback (most recent call last):
> sqlalchemy.exc.InvalidRequestError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (_mysql_exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') [SQL: 'UPDATE task_instance SET state=%s WHERE task_instance.task_id = %s AND task_instance.dag_id = %s AND task_instance.execution_date = %s'] [parameters: ('queued', 'transfer_from_tmp_table_into_cleaned_table', 'rfid_warehouse_carton_wh_g_dl_dwh_csv_uqjp_1h.s3_to_hive', datetime.datetime(2018, 5, 7, 5, 2))] (Background on this error at: http://sqlalche.me/e/e3q8)
> [2018-05-22 21:52:19,687] {models.py:1624} INFO - Marking task as FAILED.
> [2018-05-22 21:52:19,688] {base_task_runner.py:98} INFO - Subtask: [2018-05-22 21:52:19,688] {slack_hook.py:143} INFO - Message is prepared:
> [2018-05-22 21:52:19,688] {base_task_runner.py:98} INFO - Subtask: {"attachments": [{"color": "danger", "text": "", "fields": [{"title": "DAG", "value": "", "short": true}, {"title": "Owner", "value": "airflow", "short": true}, {"title": "Task", "value": "s3_to_hive", "short": false}, {"title": "Status", "value": "FAILED", "short": false}, {"title": "Execution Time", "value": "2018-05-07T05:02:00", "short": true}, {"title": "Duration", "value": "826.305929", "short": true}, {"value": "", "short": false}]}]}
> [2018-05-22 21:52:19,688] {models.py:1638} ERROR - Failed at executing callback
> [2018-05-22 21:52:19,688] {models.py:1639} ERROR - This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (_mysql_exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction') [SQL: 'UPDATE task_instance SET state=%s WHERE task_instance.task_id = %s AND task_instance.dag_id = %s AND task_instance.execution_date = %s'] [parameters: ('queued', 'transfer_from_tmp_table_into_cleaned_table', 'rfid_warehouse_carton_wh_g_dl_dwh_csv_uqjp_1h.s3_to_hive', datetime.datetime(2018, 5, 7, 5, 2))] (Background on this error at: http://sqlalche.me/e/e3q8)
> {code}
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
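
Note on the log quoted in the issue: it contains two distinct failures. The first is the MySQL deadlock itself (error 1213, "try restarting transaction"). The second is the InvalidRequestError that follows because the same session is used again without first calling Session.rollback(). The sketch below illustrates the generic pattern the error text asks for: roll back, then retry the transaction. It is only an illustration under stated assumptions and not the fix applied in Airflow. DeadlockError, FakeSession, make_flaky_update and run_with_deadlock_retry are hypothetical names introduced here; a real implementation would catch the driver's own exception and check for error code 1213.

```python
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for a driver error such as MySQL error 1213."""


def run_with_deadlock_retry(session, work, max_attempts=3, base_delay=0.0):
    """Run work(session) and commit; on deadlock, roll back and retry.

    `session` only needs commit() and rollback() methods (assumption).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            result = work(session)
            session.commit()
            return result
        except DeadlockError:
            # Roll back first so the session can begin a new transaction.
            session.rollback()
            if attempt == max_attempts:
                raise
            # Linear backoff before restarting the transaction.
            time.sleep(base_delay * attempt)


class FakeSession:
    """Hypothetical in-memory session, used only to exercise the helper."""

    def __init__(self):
        self.commits = 0
        self.rollbacks = 0

    def commit(self):
        self.commits += 1

    def rollback(self):
        self.rollbacks += 1


def make_flaky_update(failures):
    """Return a work function that deadlocks `failures` times, then succeeds."""
    state = {"left": failures}

    def update(session):
        if state["left"] > 0:
            state["left"] -= 1
            raise DeadlockError("Deadlock found when trying to get lock")
        return "queued"

    return update


if __name__ == "__main__":
    demo_session = FakeSession()
    print(run_with_deadlock_retry(demo_session, make_flaky_update(2)))
    print(demo_session.rollbacks, demo_session.commits)
```

The rollback happens before the retry decision, so the session is left usable even when the final attempt fails and the exception propagates to the caller.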