airflow-users mailing list archives

From Sunil Khaire <sunilkhair...@gmail.com>
Subject Re: Issues running only one active instance in a DAG
Date Fri, 02 Oct 2020 13:55:20 GMT
Hi Sandeep,

Looks good; you can skip max_active_runs.
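
For reference, a minimal sketch of where each setting goes (my
assumptions: Airflow 1.10.x and a hypothetical DAG id; depends_on_past
and wait_for_downstream are task-level arguments, so they belong in
default_args, while max_active_runs is a DAG-level argument):

from datetime import datetime
from airflow import DAG

default_args = {
    'owner': 'xxx',
    'start_date': datetime(2020, 6, 15),
    # Task-level: each task instance waits for its own previous run.
    'depends_on_past': True,
    # Task-level: also wait for the downstream tasks of the previous
    # run; setting this implies depends_on_past=True.
    'wait_for_downstream': True,
}

dag = DAG('my_dag',                       # hypothetical DAG id
          default_args=default_args,
          schedule_interval='*/5 * * * *',
          catchup=False,
          max_active_runs=1)              # DAG-level (optional, per above)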

Thanks,
Sunil K

On Fri, 2 Oct 2020 at 7:20 PM Sandeep Shetty <shettysandee@gmail.com> wrote:

> Hi Sunil,
>
> Can you please confirm whether the parameters below should go in
> default_args or at the DAG level:
>
> max_active_runs=1
> 'depends_on_past': True
> 'wait_for_downstream': True
>
> Regards
> Sandeep
>
> On Fri, Oct 2, 2020 at 9:39 AM Sunil Khaire <sunilkhaire17@gmail.com>
> wrote:
>
>> Please use wait_for_downstream=True.
>>
>> This should fix the issue.
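>>
>> For context (my understanding of the semantics, not stated in this
>> thread): wait_for_downstream=True makes a task instance wait until the
>> previous instance of the same task and the tasks immediately downstream
>> of that previous instance have finished, and setting it implies
>> depends_on_past=True. A minimal sketch on a single task (hypothetical
>> task id, with dag as defined below):
>>
>> from airflow.operators.bash_operator import BashOperator
>>
>> extract = BashOperator(task_id='extract',
>>                        bash_command='echo extract',
>>                        wait_for_downstream=True,  # implies depends_on_past
>>                        dag=dag)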
>>
>> Thanks,
>> Sunil K
>>
>> On Fri, 2 Oct 2020 at 6:58 PM Sandeep Shetty <shettysandee@gmail.com>
>> wrote:
>>
>>> Hi Sunil,
>>>
>>> Let me add more details:
>>> Use case: the DAG has multiple tasks and is scheduled to run every 5
>>> mins.
>>> Actual result: the DAG kicks off a 2nd run every time there is a failure
>>> in the 1st run. The status of the 1st run is Failed, but the 2nd run still
>>> kicks off after 5 mins (presumably because a failed run no longer counts
>>> as active, so max_active_runs=1 does not block it).
>>> Expected result: the DAG should not kick off a 2nd run unless the first
>>> run completes successfully.
>>>
>>> DAG Code:
>>>
>>> # Imports assumed from context (not shown in the original message):
>>> from datetime import datetime, timedelta
>>> import pendulum
>>> from airflow import DAG
>>>
>>> # DAG_NAME, NOTIFY_EMAIL and local_tz are defined elsewhere in the file;
>>> # the timezone below is a placeholder, the actual value is not shown.
>>> local_tz = pendulum.timezone('UTC')
>>>
>>> default_args = {
>>>     'owner': 'xxx',
>>>     'depends_on_past': True,
>>>     'start_date': datetime(2020, 6, 15, tzinfo=local_tz),
>>>     'email': NOTIFY_EMAIL,
>>>     'email_on_failure': True,
>>> #    'email_on_retry': True,
>>> #    'retries': 1,
>>>     'domain': 'Mediasupplychain',
>>> #    'retry_delay': timedelta(minutes=30)
>>> }
>>>
>>> dag = DAG(DAG_NAME,
>>>           default_args=default_args,
>>>           schedule_interval='0 */3 * * *',  # minute 0 of every 3rd hour
>>>           catchup=False,
>>>           max_active_runs=1)
>>>
>>> Airflow screenshot:
>>> [image: image.png]
>>>
>>> On Fri, Oct 2, 2020 at 9:11 AM Sunil Khaire <sunilkhaire17@gmail.com>
>>> wrote:
>>>
>>>> Hi Sandeep,
>>>>
>>>> It's not quite clear what you want, but if I understood correctly, you
>>>> can try depends_on_past=True or max_active_runs at the DAG level.
>>>>
>>>> Thanks,
>>>> Sunil Khaire
>>>>
>>>> On Fri, 2 Oct 2020 at 5:32 PM Sandeep S <3005sandeep@gmail.com> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I am having a production issue running only one instance of a DAG at a
>>>>> time. While one instance of the DAG is running, a 2nd instance does not
>>>>> kick off. But if any task fails in the active DAG instance, the DAG gets
>>>>> marked Failed and a 2nd instance kicks off after 5 mins (the DAG's
>>>>> scheduled interval).
>>>>>
>>>>> Please help.
>>>>>
>>>>> Regards
>>>>> Sandeep
>>>>>
>>>>> On Mon, Sep 28, 2020 at 1:18 PM Tavares Forby <tforby@qti.qualcomm.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> I am having a few issues with Airflow and DAGs with more than 750 task
>>>>>> instances. I am getting one consistent error and one error that happens
>>>>>> randomly (understood, it's technically not random).
>>>>>>
>>>>>> Consistent error:
>>>>>>
>>>>>> [2020-09-25 12:28:01,703] {scheduler_job.py:237} WARNING - Killing PID 119970
>>>>>> [2020-09-25 12:29:17,110] {scheduler_job.py:237} WARNING - Killing PID 121013
>>>>>> [2020-09-25 12:29:17,110] {scheduler_job.py:237} WARNING - Killing PID 121013
>>>>>> [2020-09-25 12:30:12,171] {scheduler_job.py:237} WARNING - Killing PID 123243
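>>>>>>
>>>>>> This warning appears to come from the scheduler's DAG file processor
>>>>>> killing a parse process that ran past its timeout (my reading; the
>>>>>> thread does not say so). If parsing the very large DAG file is what
>>>>>> times out, raising the parsing timeouts in airflow.cfg may help; a
>>>>>> sketch, assuming Airflow 1.10.x option names:
>>>>>>
>>>>>> [core]
>>>>>> # Seconds the DAG file processor may run before it is killed
>>>>>> dag_file_processor_timeout = 180
>>>>>> # Seconds allowed for importing a single DAG file
>>>>>> dagbag_import_timeout = 120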
>>>>>>
>>>>>> Random error:
>>>>>>
>>>>>> [2020-09-27 19:37:25,127] {scheduler_job.py:771} INFO - Examining DAG run
>>>>>> <DagRun tutorial_large_design_debug7 @ 2020-09-28 02:37:24+00:00:
>>>>>> manual__2020-09-28T02:37:24+00:00, externally triggered: True>
>>>>>> [2020-09-27 19:37:26,749] {logging_mixin.py:112} INFO - [2020-09-27
>>>>>> 19:37:26,749] {dagrun.py:408} INFO - (MySQLdb.exceptions.IntegrityError)
>>>>>> (1062, "Duplicate entry 'echo__a-tutorial_large_design_debug7-2020-09-28
>>>>>> 02:37:24.000000' for key 'PRIMARY'")
>>>>>>
>>>>>> [SQL: INSERT INTO task_instance (task_id, dag_id, execution_date,
>>>>>> start_date, end_date, duration, state, try_number, max_tries, hostname,
>>>>>> unixname, job_id, pool, pool_slots, queue, priority_weight, operator,
>>>>>> queued_dttm, pid, executor_config) VALUES (%s, %s, %s, %s, %s, %s, %s,
>>>>>> %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)]
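>>>>>>
>>>>>> In Airflow 1.x the task_instance primary key is (task_id, dag_id,
>>>>>> execution_date), so this error means two task instances were created
>>>>>> with the same execution date. One way that can happen (an assumption,
>>>>>> not confirmed in this thread) is two externally triggered runs landing
>>>>>> on the same second, since the manual execution date above has second
>>>>>> precision. Passing an explicit, unique execution date with each trigger
>>>>>> avoids that collision, e.g.:
>>>>>>
>>>>>> airflow trigger_dag tutorial_large_design_debug7 -e '2020-09-28T02:37:25'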
>>>>>>
>>>>>> Please help! thanks!
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
