airflow-commits mailing list archives

From "Yogesh (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AIRFLOW-462) Concurrent Scheduler Jobs pushing the same task to queue
Date Thu, 25 Aug 2016 18:31:21 GMT

     [ https://issues.apache.org/jira/browse/AIRFLOW-462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yogesh updated AIRFLOW-462:
---------------------------
    Description: 
Hi,

We are using Airflow version 1.7.0 and tried to implement high availability for the Airflow
daemons in our production environment.

Detailed high availability approach:
- Airflow running on two different machines with all the daemons (webserver, scheduler, executor)
- A single MySQL database shared by the two schedulers
- DAG files replicated on both machines
- A single RabbitMQ instance running as the message broker


While doing so we came across the problem below:

- A particular task was sent to the executor twice (two entries in the message queue) by two
different schedulers. However, we see only a single entry for the task instance in the
database, which is correct.

We checked the code and found the following:

- Before sending a task to the executor, the scheduler checks the task state in the database,
and if it is not already QUEUED, it pushes the task to the queue.
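
To make the pattern concrete, here is a minimal sketch of a check-then-queue sequence under
one unlucky interleaving. It is a hypothetical model, not Airflow's actual code: the dict
`db`, the list `broker`, and the helper `read_state` are illustrative stand-ins for the
task instance row, the RabbitMQ queue, and the state lookup.

```python
# Hypothetical model of the check-then-queue pattern; not Airflow's actual code.
QUEUED = "queued"


def read_state(db, key):
    """Stand-in for reading the task instance state from the database."""
    return db.get(key)


def simulate_interleaving():
    db = {"task_1": "scheduled"}  # stands in for the single task instance row
    broker = []                   # stands in for the message queue

    # Both schedulers read the state before either one writes QUEUED back.
    seen_by_a = read_state(db, "task_1")
    seen_by_b = read_state(db, "task_1")

    # Each scheduler then acts on the (now stale) value it read.
    for scheduler, seen in (("A", seen_by_a), ("B", seen_by_b)):
        if seen != QUEUED:
            broker.append((scheduler, "task_1"))
            db["task_1"] = QUEUED
    return db, broker


db, broker = simulate_interleaving()
print(len(broker))  # two queue entries for the same task
print(db)           # still a single row, now marked queued
```

The outcome matches what we observed: two entries in the queue but only one row in the
database.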

Issue:

There is no locking on the task instance in the database, and the two scheduler jobs run so
close together that the second one may check the state in the database before the first one
updates it to QUEUED.
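
One possible way to close this window, offered only as a sketch and not as what Airflow
does, is to let the database arbitrate: fold the check and the update into a single
conditional UPDATE and enqueue only if that statement changed a row. The example uses an
in-memory SQLite database as a stand-in for our MySQL repository; the table layout and the
helper `try_claim` are assumptions made for illustration, and it assumes the state column
is never NULL.

```python
import sqlite3


def try_claim(conn, task_id):
    """Hypothetical helper: mark the task queued only if it is not queued yet.

    Returns True only for the caller whose UPDATE actually changed the row.
    """
    cur = conn.execute(
        "UPDATE task_instance SET state = 'queued' "
        "WHERE task_id = ? AND state != 'queued'",
        (task_id,),
    )
    conn.commit()
    return cur.rowcount == 1


# Illustrative schema; the real task instance table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_instance (task_id TEXT PRIMARY KEY, state TEXT)")
conn.execute("INSERT INTO task_instance VALUES ('task_1', 'scheduled')")
conn.commit()

broker = []  # stands in for the message queue
for scheduler in ("A", "B"):
    # Only the scheduler that wins the conditional UPDATE enqueues the task.
    if try_claim(conn, "task_1"):
        broker.append((scheduler, "task_1"))

print(broker)
```

Here the second scheduler's UPDATE matches no row, so it skips the task. Whether this fits
the real scheduler code path is something we have not verified.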

We are not sure whether this issue has been taken care of in a recent release.

Could you please suggest an appropriate approach so that high availability can be
achieved?

Thanks
Yogesh

  was:
Hi,

We are using Airflow version 1.7.0 and tried to implement high availability for the Airflow
daemons in our production environment.

Detailed high availability approach:
- Airflow running on two different machines with all the daemons (webserver, scheduler, executor)
- A single MySQL database shared by the two schedulers
- DAG files replicated on both machines


While doing so we came across the problem below:

- A particular task was sent to the executor twice (two entries in the message queue) by two
different schedulers. However, we see only a single entry for the task instance in the
database, which is correct.

We checked the code and found the following:

- Before sending a task to the executor, the scheduler checks the task state in the database,
and if it is not already QUEUED, it pushes the task to the queue.

Issue:

There is no locking on the task instance in the database, and the two scheduler jobs run so
close together that the second one may check the state in the database before the first one
updates it to QUEUED.

We are not sure whether this issue has been taken care of in a recent release.

Could you please suggest an appropriate approach so that high availability can be
achieved?

Thanks
Yogesh


> Concurrent Scheduler Jobs pushing the same task to queue
> --------------------------------------------------------
>
>                 Key: AIRFLOW-462
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-462
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: Airflow 1.7.0
>            Reporter: Yogesh
>            Priority: Blocker



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
