airflow-commits mailing list archives

From "John Hofman (JIRA)" <>
Subject [jira] [Created] (AIRFLOW-2966) KubernetesExecutor + namespace quotas kills scheduler if the pod can't be launched
Date Mon, 27 Aug 2018 16:32:00 GMT
John Hofman created AIRFLOW-2966:

             Summary: KubernetesExecutor + namespace quotas kills scheduler if the pod can't be launched
                 Key: AIRFLOW-2966
             Project: Apache Airflow
          Issue Type: Bug
          Components: scheduler
    Affects Versions: 1.10
         Environment: Kubernetes 1.9.8
            Reporter: John Hofman

When Airflow runs in Kubernetes with the KubernetesExecutor and resource quotas are set on
the namespace Airflow is deployed in, the scheduler crashes if it tries to launch a pod that
would exceed the namespace limits: it gets an ApiException from the Kubernetes client, and that exception kills the scheduler.

This stack trace is an example of the ApiException from the kubernetes client:
[2018-08-27 09:51:08,516] {} ERROR - Exception when attempting to create
Namespaced Pod.
Traceback (most recent call last):
File "/src/apache-airflow/airflow/contrib/kubernetes/", line 55, in run_pod_async
resp = self._client.create_namespaced_pod(body=req, namespace=pod.namespace)
File "/usr/local/lib/python3.6/site-packages/kubernetes/client/apis/", line
6057, in create_namespaced_pod
(data) = self.create_namespaced_pod_with_http_info(namespace, body, **kwargs)
File "/usr/local/lib/python3.6/site-packages/kubernetes/client/apis/", line
6142, in create_namespaced_pod_with_http_info
File "/usr/local/lib/python3.6/site-packages/kubernetes/client/", line 321, in
_return_http_data_only, collection_formats, _preload_content, _request_timeout)
File "/usr/local/lib/python3.6/site-packages/kubernetes/client/", line 155, in
File "/usr/local/lib/python3.6/site-packages/kubernetes/client/", line 364, in
File "/usr/local/lib/python3.6/site-packages/kubernetes/client/", line 266, in POST
File "/usr/local/lib/python3.6/site-packages/kubernetes/client/", line 222, in request
raise ApiException(http_resp=r) (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'b00e2cbb-bdb2-41f3-8090-824aee79448c',
'Content-Type': 'application/json', 'Date': 'Mon, 27 Aug 2018 09:51:08 GMT', 'Content-Length':
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods
\"podname-ec366e89ef934d91b2d3ffe96234a725\" is forbidden: exceeded quota: compute-resources,
requested: limits.memory=4Gi, used: limits.memory=6508Mi, limited: limits.memory=10Gi","reason":"Forbidden","details":{"name":"podname-ec366e89ef934d91b2d3ffe96234a725","kind":"pods"},"code":403}

I would expect the scheduler to catch the Exception and at least mark the task as failed,
or better yet retry the task later.
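
A rough sketch of the behaviour I have in mind, not a patch against the actual executor code. The names is_quota_rejection, launch_or_requeue, launch_pod and retry_queue are hypothetical and only there for illustration. The only thing taken from the kubernetes client is the ApiException class with its status and body attributes. The fallback class exists only so the sketch runs where the client is not installed.

```python
import json
from collections import deque

try:
    from kubernetes.client.rest import ApiException
except ImportError:
    # Stand-in (assumption) so the sketch runs without the kubernetes package.
    class ApiException(Exception):
        def __init__(self, status=None, reason=None, http_resp=None):
            super().__init__(reason)
            self.status = status
            self.reason = reason
            self.body = None


def is_quota_rejection(exc):
    """Return True if the API server refused the pod because of a quota.

    Hypothetical helper: it looks for HTTP 403 plus the "exceeded quota"
    text seen in the response body of the trace above.
    """
    if exc.status != 403:
        return False
    try:
        payload = json.loads(exc.body or "{}")
    except (TypeError, ValueError):
        return False
    if not isinstance(payload, dict):
        return False
    return "exceeded quota" in str(payload.get("message", ""))


def launch_or_requeue(launch_pod, task, retry_queue):
    """Try to launch a pod; never let ApiException escape to the caller.

    launch_pod is any callable that creates the pod for the task
    (hypothetical; in the trace above that role is played by run_pod_async).
    """
    try:
        launch_pod(task)
        return "launched"
    except ApiException as exc:
        if is_quota_rejection(exc):
            # Quota is a temporary condition: keep the task for a later attempt.
            retry_queue.append(task)
            return "requeued"
        # Any other API error: report the task as failed instead of crashing.
        return "failed"
```

With something like this, a quota rejection leaves the task in retry_queue for a later scheduling loop, and any other API error is reported as a failed task, so the scheduler process keeps running in both cases.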


