airflow-dev mailing list archives

From Flo Rance <troura...@gmail.com>
Subject Re: PostgreSQL hook
Date Thu, 21 Mar 2019 08:15:26 GMT
Hi,

Thank you for this explanation. To summarize, I'll have to write a
file_to_postgres operator with pg_preoperator and pg_postoperator
parameters.

Just a simple question: where should I add and store this operator in the
Airflow ecosystem?

Regards,
Flo

On Wed, Mar 20, 2019 at 5:05 PM jiajie zhong <zhongjiajie955@hotmail.com>
wrote:

> Hi, Flo. I'm not very familiar with PG, but I found this code on our master
> branch:
>
> https://github.com/apache/airflow/blob/dd8ce6a7123170ef4b0f719fb773527b17d9348c/airflow/hooks/postgres_hook.py#L63-L89
>
> I think this may be what you're looking for.
>
> Also, we generally recommend using an Operator to do the work rather than a
> Hook directly. However, we don't have a "local-file-to-pg" operator yet, so
> you may have to add that functionality yourself.
>
> BTW, I think the script
> BEGIN;
>
> CREATE TEMP TABLE catalog_tmp ON COMMIT DROP AS SELECT * FROM catalog WITH
> NO DATA;
>
> \COPY catalog_tmp (...) FROM '/home/cat/catalog.csv' WITH DELIMITER ';' CSV
> ENCODING 'LATIN1' NULL '';
>
> DELETE FROM catalog_tmp WHERE code IS NULL;
> ...
> COMMIT;
> you posted is a single transaction, so it should run in a single operator.
> You could write code similar to
>
> https://github.com/apache/airflow/blob/dd8ce6a7123170ef4b0f719fb773527b17d9348c/airflow/operators/hive_to_mysql.py#L70-L71
>
> which has "pg_preoperator" and "pg_postoperator"-style parameters, but your
> version would extract data from a local file instead of Hive.
>
> ________________________________
> From: Flo Rance <trourance@gmail.com>
> Sent: Wednesday, March 20, 2019 23:30
> To: dev@airflow.apache.org
> Subject: PostgreSQL hook
>
> Hi,
>
> I don't know if this is the correct place to ask.
>
> I'm trying to implement one of my cron jobs using Airflow. One of the tasks
> is to load files into a temporary table and then update another table in a
> Postgres DB.
> For that, I was previously using an SQL script like this:
>
> BEGIN;
>
> CREATE TEMP TABLE catalog_tmp ON COMMIT DROP AS SELECT * FROM catalog WITH
> NO DATA;
>
> \COPY catalog_tmp (...) FROM '/home/cat/catalog.csv' WITH DELIMITER ';' CSV
> ENCODING 'LATIN1' NULL '';
>
> DELETE FROM catalog_tmp WHERE code IS NULL;
> ...
> COMMIT;
>
> I would like to replace \copy with copy_expert from the PostgreSQL hook. Is
> that realistic?
> If so, how can I combine an SQL script and that hook in one task?
>
> Regards,
> Flo
>
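The approach suggested in the reply above (a transfer-style operator with
pg_preoperator and pg_postoperator parameters wrapping the hook's copy_expert)
could be sketched roughly as follows. This is a minimal sketch, not the actual
Airflow API: the class name, parameter names, and the injected hook are all
hypothetical. In a real DAG you would subclass airflow.models.BaseOperator and
build a PostgresHook from a postgres_conn_id; the hook is passed in here so the
sketch stays runnable without Airflow installed.

```python
# Hypothetical "file_to_postgres" operator, modeled on the pre/post-operator
# pattern of hive_to_mysql. Names and parameters are illustrative only.

class FileToPostgresOperator:
    """Load a local file into Postgres via COPY, with optional SQL to run
    before the load (e.g. create a temp table) and after it (e.g. clean-up)."""

    def __init__(self, filepath, copy_sql,
                 pg_preoperator=None, pg_postoperator=None, hook=None):
        self.filepath = filepath                # local CSV file to load
        self.copy_sql = copy_sql                # a COPY ... FROM STDIN statement
        self.pg_preoperator = pg_preoperator    # SQL executed before the COPY
        self.pg_postoperator = pg_postoperator  # SQL executed after the COPY
        self.hook = hook                        # PostgresHook in real usage

    def execute(self, context=None):
        if self.pg_preoperator:
            self.hook.run(self.pg_preoperator)
        # PostgresHook.copy_expert(sql, filename) wraps psycopg2's
        # cursor.copy_expert, i.e. the server-side equivalent of \COPY.
        self.hook.copy_expert(self.copy_sql, self.filepath)
        if self.pg_postoperator:
            self.hook.run(self.pg_postoperator)
```

One caveat worth noting: each hook.run() or copy_expert() call typically opens
its own connection and commits, so an ON COMMIT DROP temp table created in the
pre-operator would not survive until the COPY. To reproduce the original
script's single transaction faithfully, you would instead take one connection
from hook.get_conn() and issue the pre-SQL, cursor.copy_expert(), and post-SQL
on the same cursor before committing.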
