airflow-commits mailing list archives

From "Jon Davies (JIRA)" <>
Subject [jira] [Created] (AIRFLOW-2970) Kubernetes logging is broken
Date Tue, 28 Aug 2018 13:14:00 GMT
Jon Davies created AIRFLOW-2970:

             Summary: Kubernetes logging is broken
                 Key: AIRFLOW-2970
             Project: Apache Airflow
          Issue Type: Bug
            Reporter: Jon Davies
            Assignee: Daniel Imberman

I'm using Airflow with the Kubernetes executor and the pod operator. My DAGs are configured
with get_logs=True, all of them log to stdout, and I can see all of the logs with
kubectl logs.

I can see that the scheduler logs things to: $AIRFLOW_HOME/logs/scheduler/2018-08-28/*

However, this just consists of:

[2018-08-28 13:03:27,695] {} INFO - Started process (PID=16994) to work on /home/airflow/dags/
[2018-08-28 13:03:27,697] {} INFO - Processing file /home/airflow/dags/
for tasks to queue
[2018-08-28 13:03:27,697] {} INFO - [2018-08-28 13:03:27,697] {}
INFO - Filling up the DagBag from /home/airflow/dags/

If I quickly exec into the executor pod that the scheduler spins up, I can see that things
are properly logged to:

/home/airflow/logs/dag$ tail -f dag-downloader/2018-08-28T13\:05\:07.704072+00\:00/1.log
[2018-08-28 13:05:24,399] {} INFO - [2018-08-28 13:05:24,399] {}
INFO - Event: dag-downloader-015ca48c had an event of type Pending
[2018-08-28 13:05:37,193] {} INFO - [2018-08-28 13:05:37,193] {}
INFO - b'INFO:botocore.vendored.requests.packages.urllib3.connectionpool:Starting new HTTPS
connection (7):\n'
...all other log lines from pod...

However, this executor pod exists only for the lifetime of the task pod, so the logs are
lost almost immediately after the task runs.
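
For reference, the per-task log file shown above appears to follow the layout
<dag_id>/<task_id>/<execution timestamp>/<try number>.log under the log folder. The
following is a minimal sketch of that layout only, not Airflow's own code. The helper name
task_log_path is hypothetical, and the DAG id "dag" is an assumption inferred from the
working directory in the shell prompt above.

```python
import posixpath
from datetime import datetime, timezone


def task_log_path(base_log_folder, dag_id, task_id, execution_date, try_number):
    """Hypothetical helper: build a per-task log path in the layout seen in this report.

    Assumed layout: <base>/<dag_id>/<task_id>/<ISO 8601 execution date>/<try_number>.log
    """
    timestamp = execution_date.isoformat()
    return posixpath.join(
        base_log_folder, dag_id, task_id, timestamp, "{}.log".format(try_number)
    )


# Values taken from the log excerpt in this report; "dag" as the DAG id is an assumption.
path = task_log_path(
    "/home/airflow/logs",
    "dag",
    "dag-downloader",
    datetime(2018, 8, 28, 13, 5, 7, 704072, tzinfo=timezone.utc),
    1,
)
print(path)
```

Because this file is written on the executor pod's local filesystem, it goes away when that
pod is removed.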
