From dev-return-7882-archive-asf-public=cust-asf.ponee.io@airflow.apache.org Thu Mar 21 08:15:44 2019 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx-eu-01.ponee.io (Postfix) with SMTP id CDD5B180621 for ; Thu, 21 Mar 2019 09:15:43 +0100 (CET) Received: (qmail 70777 invoked by uid 500); 21 Mar 2019 08:15:42 -0000 Mailing-List: contact dev-help@airflow.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@airflow.apache.org Delivered-To: mailing list dev@airflow.apache.org Received: (qmail 70760 invoked by uid 99); 21 Mar 2019 08:15:41 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 21 Mar 2019 08:15:41 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 59719C0A3A for ; Thu, 21 Mar 2019 08:15:41 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.049 X-Spam-Level: ** X-Spam-Status: No, score=2.049 tagged_above=-999 required=6.31 tests=[DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, HTML_MESSAGE=2, KAM_LOTSOFHASH=0.25, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Authentication-Results: spamd1-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id dss7rcaBOC6x for ; Thu, 21 Mar 2019 08:15:39 +0000 (UTC) Received: from mail-yw1-f65.google.com (mail-yw1-f65.google.com [209.85.161.65]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id F39085FAFD for ; Thu, 21 Mar 2019 08:15:38 +0000 (UTC) Received: by mail-yw1-f65.google.com with SMTP id z191so4147018ywa.6 for ; Thu, 21 Mar 2019 01:15:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=gcZ6GAde7MUZgl8rQ/QwrcuvuJESMxo8e0L+/84T0AE=; b=tMT6B0ZnpCN61Gnn419y6dFLGG92YReo2ZsEz7IWiSMQJiYT+X/WtGWcZTSQopqLQj zOjntnoPM73nhWbtioD+yL1kXVPnJKuvL7P3+5YbZEPJ5TwXiuLZ87Whv7LDI3JXFH5s +2biOS6FV4W4upOl4fmPPz0AcA7d7Lf96XSfBGU/+eI1SdgO6GCfgV/Ec/zYBKLqGxH6 CbX9x2Pk9coq2OwQwE6rrWsQ7neOYY6QwrzK0M9W3I0Tn6PDwfycPjtzmtuadRflqGW9 Qyy8HnX+ro2EbJBVTQrxtybw5asbgGSOYrMB31868ZfNBXDYe5mmj9woK8rI5PUXBfbW ZZYg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=gcZ6GAde7MUZgl8rQ/QwrcuvuJESMxo8e0L+/84T0AE=; b=nDg0hQ5+vgZeLfuQDLPko1Lc/WkyqOUpri9HBE8/ilcMEpDLACPEyZqDPp6VSPsXi2 pZZ+obWNJQU2ZNL+ZcqUJX3ugrQVZ3auzyKJKhdnMRCBBvGPZPRu3kWLPYw+9d5w3f5i OU5qwWxRgXMTu0hxPCMJdIHg9ZltGEBYMeJmn6TKc/SMR3GUQyCNaoI7vfb9aXpfgCdc P29ZhCMegznCJfPOqL9I1sMrc1mKDFb+qBgdzWVy4Jm8K1MudigI15ybF1b2uUm0zuj3 bUSCVEGpH9l17ndszvx1NJYB1EdzO2tisVsZUUZcPQ18dDcEWaOTS9weuM8biI41FDka RIXA== X-Gm-Message-State: APjAAAUmCjPV1pb7AGSwlkPIU1H38C4N5lIiL0WSevaHFTVehcau5p6e rARGROgNF3UT1Kp4bngR+LIbLqoaM9+NxosA3KIVNGY3 X-Google-Smtp-Source: APXvYqwVBfdd5rZn2ooeE2qc9/1ek82uDClP8WkVKqgEyPFb+Oa/Z1XUZTtTyUTuPj0HH8BAKSGhyE2YIgbB2jpA6ak= X-Received: by 2002:a25:bf88:: with SMTP id l8mr1765562ybk.401.1553156137428; Thu, 21 Mar 2019 01:15:37 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Flo Rance Date: Thu, 21 Mar 2019 09:15:26 +0100 Message-ID: Subject: Re: PostgreSQL hook To: dev@airflow.apache.org Content-Type: multipart/alternative; boundary="000000000000da2c6c05849657b4" --000000000000da2c6c05849657b4 Content-Type: text/plain; charset="UTF-8" Hi, Thank you for this explanation. If I summarize, I'll have to write a file_to_postgres Operator, with pg_preoperator and pg_postoperator parameters. Just a simple question: Where should I add and store this Operator in the airflow ecosystem ? Regards, Flo On Wed, Mar 20, 2019 at 5:05 PM jiajie zhong wrote: > Hi, Flo. I am not good at PG, but I find code in out master branch > > https://github.com/apache/airflow/blob/dd8ce6a7123170ef4b0f719fb773527b17d9348c/airflow/hooks/postgres_hook.py#L63-L89 > < > https://github.com/apache/airflow/blob/dd8ce6a7123170ef4b0f719fb773527b17d9348c/airflow/hooks/postgres_hook.py#L63-L83 > > > I think maybe this is what you looking for. > > And, we recommend use Operator to do something instead of Hook. But in we > have no "local-file-pg-operator". maybe you should and this function by > youself. > > BWT, I think > BEGIN; > > CREATE TEMP TABLE catalog_tmp ON COMMIT DROP AS SELECT * FROM catalog WITH > NO DATA; > > \COPY catalog_tmp (...) FROM '/home/cat/catalog.csv' WITH DELIMITER ';' CSV > ENCODING 'LATIN1' NULL ''; > > DELETE FROM catalog_tmp WHERE code IS NULL; > ... > COMMIT; > what you said is a transaction, and so do in a single operator. you could > write code just like > > https://github.com/apache/airflow/blob/dd8ce6a7123170ef4b0f719fb773527b17d9348c/airflow/operators/hive_to_mysql.py#L70-L71 > [https://avatars3.githubusercontent.com/u/47359?s=400&v=4]< > https://github.com/apache/airflow/blob/dd8ce6a7123170ef4b0f719fb773527b17d9348c/airflow/operators/hive_to_mysql.py#L70-L71 > > > > apache/airflow< > https://github.com/apache/airflow/blob/dd8ce6a7123170ef4b0f719fb773527b17d9348c/airflow/operators/hive_to_mysql.py#L70-L71 > > > Apache Airflow. Contribute to apache/airflow development by creating an > account on GitHub. > github.com > > that have "pg_preoperator" and "pg_postoperator" parameter, but extract > data from local file instand of hive. > > ________________________________ > From: Flo Rance > Sent: Wednesday, March 20, 2019 23:30 > To: dev@airflow.apache.org > Subject: PostgreSQL hook > > Hi, > > I don't know if it's the correct place to ask for that. > > I'm trying to implement one of my cronjob using airflow. One of the tasks > is to load files in a temporary table and then update another table in a > postgres db. > For that, I was previously using a sql script like that: > > BEGIN; > > CREATE TEMP TABLE catalog_tmp ON COMMIT DROP AS SELECT * FROM catalog WITH > NO DATA; > > \COPY catalog_tmp (...) FROM '/home/cat/catalog.csv' WITH DELIMITER ';' CSV > ENCODING 'LATIN1' NULL ''; > > DELETE FROM catalog_tmp WHERE code IS NULL; > ... > COMMIT; > > I would like to replace \copy with the copy_expert from postgresql hook. Is > that realistic ? > If yes, how can I combine a sql script and that hook in one task ? > > Regards, > Flo > --000000000000da2c6c05849657b4--