airflow-commits mailing list archives

From "Jon Davies (JIRA)" <j...@apache.org>
Subject [jira] [Created] (AIRFLOW-2970) Kubernetes logging is broken
Date Tue, 28 Aug 2018 13:14:00 GMT
Jon Davies created AIRFLOW-2970:
-----------------------------------

             Summary: Kubernetes logging is broken
                 Key: AIRFLOW-2970
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2970
             Project: Apache Airflow
          Issue Type: Bug
            Reporter: Jon Davies
            Assignee: Daniel Imberman


I'm using Airflow with the Kubernetes executor and the Kubernetes pod operator. My DAGs are
configured with get_logs=True and all of them log to stdout, so I can see all the logs in
kubectl logs.

The scheduler writes its own logs to: $AIRFLOW_HOME/logs/scheduler/2018-08-28/*

However, these files contain only entries like:

{code:java}
[2018-08-28 13:03:27,695] {jobs.py:385} INFO - Started process (PID=16994) to work on /home/airflow/dags/dag.py
[2018-08-28 13:03:27,697] {jobs.py:1782} INFO - Processing file /home/airflow/dags/dag.py for tasks to queue
[2018-08-28 13:03:27,697] {logging_mixin.py:95} INFO - [2018-08-28 13:03:27,697] {models.py:258} INFO - Filling up the DagBag from /home/airflow/dags/dag.py
{code}

If I quickly exec into the executor pod that the scheduler spins up, I can see that the task
output is properly logged there:

{code:java}
/home/airflow/logs/dag$ tail -f dag-downloader/2018-08-28T13\:05\:07.704072+00\:00/1.log
[2018-08-28 13:05:24,399] {logging_mixin.py:95} INFO - [2018-08-28 13:05:24,399] {pod_launcher.py:112} INFO - Event: dag-downloader-015ca48c had an event of type Pending
...
[2018-08-28 13:05:37,193] {logging_mixin.py:95} INFO - [2018-08-28 13:05:37,193] {pod_launcher.py:95} INFO - b'INFO:botocore.vendored.requests.packages.urllib3.connectionpool:Starting new HTTPS connection (7): blah-blah.s3.eu-west-1.amazonaws.com\n'
...
...all other log lines from pod...
{code}
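
The task log lines above are wrapped twice: the outer record comes from logging_mixin.py and the inner one from pod_launcher.py. When comparing the two log locations it can help to strip the outer wrapper. The following is only a rough sketch in plain Python; the regular expression and the `unwrap` helper are assumptions based on the lines quoted in this report and are not part of Airflow.

```python
import re

# Assumed record layout, taken from the log lines quoted in this report:
#   [timestamp] {file.py:line} LEVEL - message
RECORD = re.compile(
    r"\[(?P<ts>[^\]]+)\] \{(?P<source>[^}:]+):(?P<line>\d+)\} "
    r"(?P<level>[A-Z]+) - (?P<msg>.*)",
    re.DOTALL,
)


def unwrap(line):
    """Return the innermost record as a dict, or None if the line does not match.

    Hypothetical helper: a record whose message is itself a record
    (as produced by the logging_mixin.py wrapper) is unwrapped recursively.
    """
    match = RECORD.match(line)
    if match is None:
        return None
    record = match.groupdict()
    inner = unwrap(record["msg"])
    return inner if inner is not None else record


sample = (
    "[2018-08-28 13:05:24,399] {logging_mixin.py:95} INFO - "
    "[2018-08-28 13:05:24,399] {pod_launcher.py:112} INFO - "
    "Event: dag-downloader-015ca48c had an event of type Pending"
)
record = unwrap(sample)
print(record["source"], record["msg"])
```

Lines that are not double-wrapped, such as the jobs.py entries in the scheduler log, come back unchanged as a single record.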

However, this executor pod exists only for the lifetime of the task pod, so the logs are
lost almost immediately after the task finishes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
