airflow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dubey, Somesh" <somesh.du...@nordstrom.com>
Subject airflow 1.9 tasks randomly failing | k8 - hive
Date Fri, 24 Aug 2018 13:40:04 GMT
We are on AirFlow 1.9 on Kubernetes. We have many hive DAGs which we call via JDBC (via rest
end point) and randomly tasks fail. One day one would fail and next day some other.
The pods are all healthy and there is no node eviction/any issues and retries typically works.
Also even when the tasks fails it completes on the hive side successfully.
We do not get good logs of the failure and typically get partial sql in sysout in the task
log with no error message. Loglevel increased to 5 in jdbc connection.
This only happens in production.

Not been able to reproduce the issue in dev.
The closest we have gotten in dev to replicate similar behavior is to kill the pid (manually
killing airflow run --raw) in worker node.
That case also things run fine on hive side but task fails in Airflow.
Similar task log with no error is received.

Have you seen this kind of behavior. Any help is much appreciated.

Thanks so much,
Somesh





Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message