airflow-dev mailing list archives

From alireza.khoshkbari@gmail.com <alireza.khoshkb...@gmail.com>
Subject How Airflow import modules as it executes the tasks
Date Wed, 16 May 2018 01:21:16 GMT
To start off, here is my project structure:
├── dags
│   ├── __init__.py
│   ├── core
│   │   ├── __init__.py
│   │   ├── operators
│   │   │   ├── __init__.py
│   │   │   ├── first_operator.py
│   │   └── util
│   │       ├── __init__.py
│   │       ├── db.py
│   ├── my_dag.py

Here are the versions and details of the Airflow Docker setup:

In my DAG, different tasks connect to a database (not the Airflow db). I've set up db connection
pooling and expected db.py to be loaded only once across the DagRun. However, the logs show that
each task imports the module and opens its own new db connections. I can tell that db.py is
re-loaded in every task because I added the line below to db.py:

logging.info("I was loaded {}".format(random.randint(0,100)))
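
For context, db.py looks roughly like the sketch below. This is a simplified sketch assuming
SQLAlchemy is used for the pooling; the connection string and helper name are placeholders,
not my real code.

# db.py -- simplified sketch (SQLAlchemy assumed, connection string is a placeholder)
import logging
import random

from sqlalchemy import create_engine

logging.info("I was loaded {}".format(random.randint(0, 100)))

# Module-level engine with a connection pool. It is re-created every time the
# module is imported, i.e. once per task process.
engine = create_engine(
    "postgresql://user:pass@dbhost/mydb",
    pool_size=5,
    max_overflow=2,
)

def get_connection():
    """Check a connection out of the module-level pool."""
    return engine.connect()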

I understand that each operator can technically run on a separate machine, so it makes sense
that each task runs more or less independently. However, I'm not sure whether that also applies
when using the LocalExecutor. So the question is: how can I share resources (db connections)
across tasks when using the LocalExecutor?
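
For reference, the tasks use the shared module roughly like this; again a simplified sketch
with placeholder task ids and callables, not my real DAG:

# my_dag.py -- simplified sketch; imports core.util.db and uses its pool
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

from core.util import db  # re-imported by every task process under LocalExecutor


def fetch_something():
    # Uses the module-level pool from db.py; each forked task process ends up
    # with its own copy of the module and therefore its own pool.
    with db.get_connection() as conn:
        conn.execute("SELECT 1")


dag = DAG("my_dag", start_date=datetime(2018, 1, 1), schedule_interval=None)

task_a = PythonOperator(task_id="task_a", python_callable=fetch_something, dag=dag)
task_b = PythonOperator(task_id="task_b", python_callable=fetch_something, dag=dag)

task_a >> task_b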
