airflow-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chris Riccomini (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-139) Executing VACUUM with PostgresOperator
Date Thu, 19 May 2016 15:51:12 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15291350#comment-15291350
] 

Chris Riccomini commented on AIRFLOW-139:
-----------------------------------------

It seems useful to be able to execute VACUUM calls from Airflow.

It looks like DbApiHook always runs in a transaction, though:

{code}
    def run(self, sql, autocommit=False, parameters=None):
        conn = self.get_conn()
        if isinstance(sql, basestring):
            sql = [sql]

        if self.supports_autocommit:
           self.set_autocommit(conn, autocommit)

        cur = conn.cursor()
        for s in sql:
            logging.info(s)
            if parameters is not None:
                cur.execute(s, parameters)
            else:
                cur.execute(s)
        cur.close()
        conn.commit()
        conn.close()
{code}

> Executing VACUUM with PostgresOperator
> --------------------------------------
>
>                 Key: AIRFLOW-139
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-139
>             Project: Apache Airflow
>          Issue Type: Bug
>    Affects Versions: Airflow 1.7.0
>            Reporter: Rafael
>
> Dear Airflow Maintainers,
> h1. Environment
> * Airflow version: *v1.7.0*
> * Airflow components: *PostgresOperator*
> * Python Version: *Python 3.5.1*
> * Operating System: *15.4.0 Darwin*
> h1. Description of Issue
> I am trying to execute a `VACUUM` command as part of DAG with the `PostgresOperator`,
which fails with the following error:
> {quote}
> [2016-05-14 16:14:01,849] {__init__.py:36} INFO - Using executor SequentialExecutor
> Traceback (most recent call last):
>   File "/usr/local/bin/airflow", line 15, in <module>
>     args.func(args)
>   File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/airflow/bin/cli.py",
line 203, in run
>     pool=args.pool,
>   File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/airflow/models.py",
line 1067, in run
>     result = task_copy.execute(context=context)
>   File "/usr/local/lib/python3.5/site-packages/airflow/operators/postgres_operator.py",
line 39, in execute
>     self.hook.run(self.sql, self.autocommit, parameters=self.parameters)
>   File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/airflow/hooks/dbapi_hook.py",
line 109, in run
>     cur.execute(s)
> psycopg2.InternalError: VACUUM cannot run inside a transaction block
> {quote}
> I could create a small python script that performs the operation, as explained in [this
stackoverflow entry](http://stackoverflow.com/questions/1017463/postgresql-how-to-run-vacuum-from-code-outside-transaction-block).
However, I would like to know first if the `VACUUM` command should be supported by the `PostgresOperator`.
> h1. Reproducing the Issue
> The operator can be declared as follows:
> {quote}
> conn = ('postgres_default')
> t4 = PostgresOperator(
>     task_id='vacuum',
>     postgres_conn_id=conn,
>     sql=("VACUUM public.table"),
>     dag=dag
>     )
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message