airflow-dev mailing list archives

From Ash Berlin-Taylor <...@apache.org>
Subject [VOTE] Accept AIP-13: OpenAPI 3 based API definition
Date Thu, 21 Feb 2019 17:28:26 GMT
Dear Airflow community,

This email calls for a vote to accept Airflow Improvement Proposal 13: OpenAPI 3 based API
definition.

The vote will last for at least 1 week, and until three +1 (binding) votes have been cast.

This vote is on the proposal itself, not any specific code or pull request. A failed vote
does not mean the proposal is rejected, just not accepted at this time. (Rejecting a proposal
entirely is its own vote.)

This is my +1 (binding) vote.

My summary of this proposal: Make the API endpoints discoverable and introspectable by creating
an OpenAPI 3 (a.k.a. Swagger) definition of the current API.

(OpenAPI is the format that Kubernetes uses to define its API.)

The original proposal from the wiki is included in full at the end of this email (duplicated
from https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103088709 for permanent
record).

Thanks,
Ash

-- Full Proposal --

# Motivation

Operations on an Airflow server currently have to be run through a CLI that either interacts
directly with the database or, in JSON mode, performs a subset of operations via the existing
experimental API. This complicates authentication and interaction with the Airflow installation,
because the CLI needs a database user rather than a web user.

# Requirements

	• Reduce complexity in handling authentication.

	• Provide a layer of abstraction using an industry standard interface to allow authenticated,
remote control for Airflow installations

	• Any API should conform to existing industry standards

	• API structure should be discoverable

	• API should be versioned to handle any future backwards incompatible schema changes.

# Proposal

Develop a JSON RESTful API using the existing Plugin interface, defined by OpenAPI 3.

# Implementation

## API Definition

The API will be defined by a YAML OpenAPI 3 definition, which will be exposed via connexion.
For the time being, the API JSON data structures will follow the existing definitions as defined
in https://airflow.apache.org/api.html.

This means that client libraries can be autogenerated, so development work is focused
on the API structure and method handlers themselves, not on the infrastructure or client
libraries. This also opens Airflow up to being easily controlled from many languages and
interfaces with minimal work required on the server side.
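As a rough illustration of what such a definition contains, here is a minimal sketch of an OpenAPI 3 document for one of the existing endpoints, written as a Python dict rather than YAML so it can be inspected with the standard library alone. The title, the response description, and the handler name in operationId are assumptions made for this example; they are not taken from the proof of concept.

```python
import json

# Hypothetical minimal OpenAPI 3 document covering a single existing endpoint.
# "example_api.get_pool" is an illustrative handler name, not one from the PoC.
spec = {
    "openapi": "3.0.0",
    "info": {"title": "Airflow API (sketch)", "version": "1"},
    "paths": {
        "/pools/{pool_name}": {
            "get": {
                "operationId": "example_api.get_pool",
                "parameters": [
                    {
                        "name": "pool_name",
                        "in": "path",
                        "required": True,
                        "schema": {"type": "string"},
                    }
                ],
                "responses": {"200": {"description": "The requested pool"}},
            }
        }
    },
}

# Path items may also hold non-operation keys (e.g. shared "parameters"),
# so only keys that are HTTP methods are treated as operations.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(document):
    """Return sorted (METHOD, path, operationId) tuples for every operation."""
    operations = []
    for path, path_item in document["paths"].items():
        for method, operation in path_item.items():
            if method in HTTP_METHODS:
                operations.append((method.upper(), path, operation.get("operationId")))
    return sorted(operations)


if __name__ == "__main__":
    # The same structure, serialised; a YAML file would carry identical keys.
    print(json.dumps(spec, indent=2))
    print(list_operations(spec))
```

A client generator or a framework such as connexion reads this same structure; connexion uses operationId to locate the Python handler for each operation.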

The API base url will have the format {protocol}://{airflowHost}/api/v{apiVersion}/, where:

	• protocol is one of http or https.

	• airflowHost is AIRFLOW__WEBSERVER__BASE_URL.

	• apiVersion is an integer.
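The base URL format above can be sketched as a small helper. The function name and the validation rules are assumptions for illustration, and airflow_host is assumed here to hold only the host (and optional port) portion of the configured value.

```python
def api_base_url(protocol, airflow_host, api_version):
    """Build {protocol}://{airflowHost}/api/v{apiVersion}/ from its three parts.

    Hypothetical helper: the name and the checks are not part of the proposal.
    """
    if protocol not in ("http", "https"):
        raise ValueError("protocol must be 'http' or 'https'")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(api_version, bool) or not isinstance(api_version, int):
        raise ValueError("api_version must be an integer")
    return f"{protocol}://{airflow_host}/api/v{api_version}/"


if __name__ == "__main__":
    print(api_base_url("https", "airflow.example.com", 1))
```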

## Endpoints

For the initial work, the existing endpoints should not be modified, so that a minor release
is not required to get this structure into the code base.

In the interest of maintaining backwards compatibility, the existing API would be maintained
with the same structure at the same endpoints, but served by the OpenAPI codebase, until it is
deprecated in a future release. In parallel, a new API would be defined at /api/v1
with the additional API resources.

###  Existing

The base URL will prefix all endpoints. There will eventually be significantly more endpoints
to cover Airflow functionality, but this list covers the existing experimental API with some
extra intermediary endpoints.

	• /dags/{dag_id}/dag_runs

	• /dags/{dag_id}/dag_runs/{execution_date}

	• /test

	• /dags/{dag_id}/tasks/{task_id}

	• /dags/{dag_id}/dag_runs/{execution_date}/tasks/{task_id}

	• /dags/{dag_id}/paused/{paused}

	• /latest_runs

	• /pools

	• /pools/{pool_name}
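
A client would fill the {placeholders} in these paths before sending a request. The following is a minimal sketch of that expansion using only the standard library; the function name and the sample dag_id and execution_date values are assumptions for illustration.

```python
from urllib.parse import quote


def expand_path(template, **params):
    """Fill {placeholders} in an endpoint template, percent-encoding each value.

    Hypothetical helper for illustration; raises KeyError if a placeholder
    has no matching keyword argument.
    """
    # safe="" also encodes "/" so a value cannot introduce extra path segments.
    encoded = {name: quote(str(value), safe="") for name, value in params.items()}
    return template.format(**encoded)


if __name__ == "__main__":
    print(
        expand_path(
            "/dags/{dag_id}/dag_runs/{execution_date}",
            dag_id="example_dag",
            execution_date="2019-02-21T00:00:00",
        )
    )
```

Encoding matters mostly for execution_date, since timestamps contain ":" characters.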

### Additional

These are suggested endpoints to replace existing ones, as they are more in line with the
concept of resources rather than actions.

	• /dags

	• /dags/{dag_id}

	• /dags/{dag_id}/dag_runs/{execution_date}/tasks

	• /dag_runs

		• Alias to /latest_runs
	• /dags/{dag_id}/tasks

	• /healthcheck

		• alias to /test

	• /dags/{dag_id}/status

		• Intended to replace /dags/{dag_id}/paused/{paused} with a RESTful resource, as opposed
to a single field.
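
The two aliases in the list above could be resolved with a simple lookup table. This is only a sketch: the table and function names are hypothetical, and the proposal does not say whether aliasing would happen in the OpenAPI definition or in the routing layer.

```python
# Aliases stated in the proposal: resource-style path -> existing path.
ALIASES = {
    "/dag_runs": "/latest_runs",
    "/healthcheck": "/test",
}


def canonical_path(path):
    """Return the existing endpoint an alias points at, or the path unchanged."""
    return ALIASES.get(path, path)


if __name__ == "__main__":
    for path in ("/healthcheck", "/dag_runs", "/pools"):
        print(path, "->", canonical_path(path))
```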

## Discovery

To make the API discoverable, connexion provides a self-documenting API UI with Swagger. Beyond
this, the API structure itself should be discoverable, by implementing a concept such as HATEOAS
via HAL.

Each resource or resource listing response should follow the structure defined in http://stateless.co/hal_specification.html.

*NOTE:* The exact structures of _links and _curies are still to be defined.
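
To give a feel for the HAL shape, here is a minimal sketch of a response builder. Only the reserved _links and _embedded keys and the self link come from the HAL specification; the helper name, the pool fields, and the "collection" link relation are assumptions for illustration, since the actual link structure is still undecided.

```python
def hal_resource(href, properties, links=None, embedded=None):
    """Wrap plain properties in a HAL-style document.

    Hypothetical helper: adds a "self" link, any extra link relations,
    and an optional "_embedded" section.
    """
    document = dict(properties)
    document["_links"] = {"self": {"href": href}}
    for rel, target in (links or {}).items():
        document["_links"][rel] = {"href": target}
    if embedded:
        document["_embedded"] = embedded
    return document


# Illustrative pool resource; the field names are assumptions.
pool = hal_resource(
    "/api/v1/pools/default",
    {"name": "default"},
    links={"collection": "/api/v1/pools"},
)

# A listing embeds its member resources and links back to itself.
pool_listing = hal_resource("/api/v1/pools", {"count": 1}, embedded={"pools": [pool]})

if __name__ == "__main__":
    import json

    print(json.dumps(pool_listing, indent=2))
```

A client can then follow _links from any response instead of constructing URLs itself, which is the discoverability property HATEOAS aims for.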

## Proof of concept

There is an example of this implementation here: https://github.com/apache/airflow/pull/4640/files

Important points to note are:

	• Using plugin entrypoint to isolate functionality https://github.com/apache/airflow/pull/4640/files#diff-2eeaed663bd0d25b7e608891384b7298R416

	• OpenAPI 3 definition https://github.com/apache/airflow/pull/4640/files#diff-93e827c54cbc441d84674c814dcae00e

	• Api Plugin and blueprint hooks https://github.com/apache/airflow/pull/4640/files#diff-5ff8468ade348aeb2ccc273cf3b79550

The sample documentation can be seen here:

	• https://editor.swagger.io/?url=https%3A%2F%2Fraw.githubusercontent.com%2Fdrewsonne%2Fairflow%2Ffeature%2Fflask-restful%2Fairflow%2Fplugin%2Fapi%2Fswagger.yaml


# Considerations

Using a structured definition like OpenAPI may make it harder to support edge cases such as
complex, non-JSON, or non-REST endpoints in the API.

Complex authentication methods may also be difficult to implement. The existing PoC above
handles the authentication using the existing api authentication methods. In future, API keys
or OAuth keys may be a better solution for API access, instead of requiring a session or an
OAuth login.


