airflow-dev mailing list archives

From "Daniel Lamblin [Data Science & Platform Center]" <lamb...@coupang.com>
Subject Re: Setting up ETLs running Redshift queries
Date Mon, 23 Oct 2017 05:38:57 GMT
I think that you're overlooking that for #1 the sql field can be either a
query string or a file name ending in .sql, for which the pre_execute step
will apply Jinja2 templating, allowing you to set variables accordingly (e.g.
if your task has a params = {'some': 'thing'} dictionary, the query could
include {{ params.some }}).
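To make that concrete, here is a minimal sketch of approach #1 (the DAG name,
the 'redshift_default' connection id, and the query are my own illustrative
assumptions, not anything from your setup):

from datetime import datetime
from airflow import DAG
from airflow.operators.postgres_operator import PostgresOperator

dag = DAG('redshift_etl_example',
          start_date=datetime(2017, 10, 1),
          schedule_interval='@daily')

# 'sql' is rendered with Jinja2 before execution; it could just as well be
# a relative path such as 'queries/load.sql'.
load_events = PostgresOperator(
    task_id='load_events',
    postgres_conn_id='redshift_default',  # assumed connection pointing at your Redshift cluster
    sql="INSERT INTO events_{{ params.some }} "
        "SELECT * FROM staging_events WHERE event_date = '{{ ds }}'",
    params={'some': 'thing'},
    dag=dag,
)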
If you are looking to use the Hook directly, just be sure that your task
extends the PostgresOperator, or, if it's an extension of the BaseOperator,
apply the same two lines in the class:
template_fields = ('sql',)
template_ext = ('.sql',)
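As a rough sketch of that second case (the class name and the hook usage are
my own illustration, assuming a BaseOperator subclass that runs SQL through a
PostgresHook):

from airflow.hooks.postgres_hook import PostgresHook
from airflow.models import BaseOperator
from airflow.utils.decorators import apply_defaults

class RedshiftQueryOperator(BaseOperator):
    # These two lines are what make Airflow render 'sql' (and *.sql files) with Jinja2.
    template_fields = ('sql',)
    template_ext = ('.sql',)

    @apply_defaults
    def __init__(self, sql, redshift_conn_id='redshift_default', *args, **kwargs):
        super(RedshiftQueryOperator, self).__init__(*args, **kwargs)
        self.sql = sql
        self.redshift_conn_id = redshift_conn_id

    def execute(self, context):
        # By the time execute runs, self.sql has already been rendered.
        PostgresHook(postgres_conn_id=self.redshift_conn_id).run(self.sql)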
In short, I recommend the #1 approach, with templating and the macros
available in templates.
-Daniel

On Fri, Oct 20, 2017 at 10:24 AM, Veeranagouda Mukkanagoudar <
mukkanagoudar@gmail.com> wrote:

> I am new to Airflow, and need help with how to configure the
> ETLs running Redshift queries.
>
> The approaches I know of:
>
> *1. Use Postgres operator/hook:*
> Parse the query from a file, and run it via the hook. Use XCom to pass/set
> variables across tasks.
>
> *2. Bash Operator:*
> Use this to invoke the psql CLI; since Redshift doesn't have a concept of
> variables, the queries need to be passed as dynamically generated strings.
>
> Both seem to be a bit hard to adopt; are there any other approaches or options?
>
> Thanks
> Veera
>



-- 
-Daniel Lamblin
