airflow-commits mailing list archives

From "Rolf Schroeder (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-1642) An Alembic script not using scoped session causing deadlock
Date Fri, 02 Mar 2018 08:53:00 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16383360#comment-16383360 ]

Rolf Schroeder commented on AIRFLOW-1642:
-----------------------------------------

Hi,

the following workaround lets you apply Joy's patch on a one-time basis:

 
{code:bash}
# Go to the Airflow migrations directory
cd /path/to/venv/lib/python*/site-packages/airflow/migrations/versions
# Make a backup of the "faulty" revision
rsync -a cc1e65623dc7_add_max_tries_column_to_task_instance.py cc1e65623dc7_add_max_tries_column_to_task_instance.py.bak
# Add patch
sed -i 's/session = sessionmaker(bind=connection)/session = settings.Session()/' cc1e65623dc7_add_max_tries_column_to_task_instance.py
# Init db
airflow initdb
# Restore the revision
rsync -av cc1e65623dc7_add_max_tries_column_to_task_instance.py.bak cc1e65623dc7_add_max_tries_column_to_task_instance.py
{code}
 

This is obviously not how the problem should be fixed, but it is a working solution until
someone is bold enough to actually fix the migration ;)
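
For anyone who prefers to script this in Python, here is a rough sketch of the same patch-and-restore idea. It is not part of Airflow: `temporarily_patched` is a hypothetical helper name, and the two strings are simply copied from the sed command above. Unlike the shell version, it restores the original file even if the wrapped command fails, and it removes the backup afterwards.

```python
import contextlib
import os
import shutil
from pathlib import Path

# Both strings are taken verbatim from the sed command in the workaround.
OLD = "session = sessionmaker(bind=connection)"
NEW = "session = settings.Session()"


@contextlib.contextmanager
def temporarily_patched(path, old=OLD, new=NEW):
    """Replace `old` with `new` in `path`; restore the original file on exit."""
    path = Path(path)
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)  # keep a copy of the unpatched revision
    try:
        text = path.read_text()
        if old not in text:
            raise ValueError(f"{old!r} not found in {path}")
        path.write_text(text.replace(old, new))
        yield path
    finally:
        os.replace(backup, path)  # always put the original back


# Hypothetical usage; `revision` must point at the revision file from the
# shell snippet above (the /path/to/venv part is a placeholder there too):
#
#   import subprocess
#   with temporarily_patched(revision):
#       subprocess.run(["airflow", "initdb"], check=True)
```

The usage is left as a comment on purpose, since the real path depends on your install.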

> An Alembic script not using scoped session causing deadlock
> -----------------------------------------------------------
>
>                 Key: AIRFLOW-1642
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1642
>             Project: Apache Airflow
>          Issue Type: Bug
>            Reporter: Joy Gao
>            Priority: Minor
>
> The bug I'm about to describe is more of an obscure edge case; however, I think it's
still worth fixing.
> After upgrading to Airflow 1.9, while running `airflow resetdb` on my local machine (with
MySQL), I encountered a deadlock on the final Alembic revision _d2ae31099d61 Increase text
size for MySQL (not relevant for other DBs' text types)_.
> The deadlock turned out to be caused by an earlier session that was created and
left open in revision _cc1e65623dc7 add max tries column to task instance_, specifically by
the code below:
> {code}
> sessionmaker = sa.orm.sessionmaker()
> session = sessionmaker(bind=connection)
> dagbag = DagBag(settings.DAGS_FOLDER)
> {code}
> The session created here was not a `scoped_session`. When the DAGs were being parsed
in line 3 above, one of the DAG files made a direct call to the class method `Variable.get()`
to acquire an env variable, which queries the `variable` table. The variable did not exist,
so the call raised a KeyError, and the lock on the `variable` table was still held after
that exception.
> Later on, the final Alembic script _d2ae31099d61_ needs to alter the `variable` table.
Instead of creating its own Session object, it attempts to reuse the same one as above.
Because of the exception, it waits indefinitely to acquire the lock on that table.
> The DAG file itself could have avoided the KeyError by providing a default value when
calling `Variable.get()`. However, I think it would be a good idea to avoid unscoped sessions
in general, as an exception could occur elsewhere in the future. The easiest
fix is replacing *session = sessionmaker(bind=connection)* with *session = settings.Session()*,
which is scoped. However, changing a migration script is going to make folks anxious.
> If anyone has any thoughts on this, let me know! Thanks :)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
