airflow-commits mailing list archives

From "ASF GitHub Bot (Jira)" <>
Subject [jira] [Commented] (AIRFLOW-5126) Read aws_session_token in extra_config of the aws hook
Date Fri, 11 Oct 2019 06:28:00 GMT


ASF GitHub Bot commented on AIRFLOW-5126:

jojo19893 commented on pull request #6303: AIRFLOW-5126 Read aws_session_token in extra_config
of the aws hook
   ### Description
   Read a temporary session token when one is present. This matters if you don't manage the
session token through `Airflow` but instead use something like [vault](
to manage it.
   ### Tests
   Adjusted the existing test to also parse the session token; no other impact.

> Read aws_session_token in extra_config of the aws hook
> ------------------------------------------------------
>                 Key: AIRFLOW-5126
>                 URL:
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: hooks
>    Affects Versions: 1.10.3
>            Reporter: Alexandre Blanchard
>            Assignee: Johannes Günther
>            Priority: Minor
> Hi,
> Thanks for the great software.
> At my company, we enforce security around our AWS account, and all accounts must have MFA activated. To use Airflow with my account, I generate a session token with an expiration date using the command
> {code:java}
> aws sts assume-role --role-arn <the-role-i-want-use> --role-session-name testing \
>  --serial-number <my-personal-mfa-arn> --token-code <code-on-my-mfa-device> \
>  --duration-seconds 18000
> {code}
> This way I retrieve everything I need to connect to AWS: an aws_access_key_id, an aws_secret_access_key and an aws_session_token.
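As a rough illustration of that step, the sketch below pulls the three values out of the JSON that `aws sts assume-role` prints. The `Credentials` field names follow the STS response; the helper name `extract_temporary_credentials`, the sample string, and all placeholder values are assumptions made for this example and do not come from the issue.

```python
import json

# Sample shaped like the JSON printed by `aws sts assume-role`.
# Every value is a placeholder, not real output.
SAMPLE_OUTPUT = """
{
  "Credentials": {
    "AccessKeyId": "<aws_access_key_id>",
    "SecretAccessKey": "<aws_secret_access_key>",
    "SessionToken": "<aws_session_token>",
    "Expiration": "<expiration>"
  }
}
"""


def extract_temporary_credentials(cli_output: str) -> dict:
    """Hypothetical helper: map the STS response fields to boto3-style names."""
    creds = json.loads(cli_output)["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


credentials = extract_temporary_credentials(SAMPLE_OUTPUT)
```

All three values are needed together: temporary keys are not accepted without the matching session token.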
> Currently I'm using boto3 directly in my DAG and it's working great. I would like to use a connection managed by Airflow, but when I set the parameters this way:
> {code:java}
> airflow connections --add \
>  --conn_id s3_log \
>  --conn_type s3 \
>  --conn_login "<aws_access_key_id>" \
>  --conn_password "<aws_secret_access_key>" \
>  --conn_extra "{ \
>    \"aws_session_token\": \"<aws_session_token>\" \
> }"
> {code}
> With a hook using this connection, I get the error:
> {code:java}
> [2019-08-06 12:31:28,157] {} ERROR - An error occurred (403) when calling the HeadObject operation: Forbidden
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.7/site-packages/airflow/models/", line 1441, in _run_raw_task
>     result = task_copy.execute(context=context)
>   File "/usr/local/lib/python3.7/site-packages/airflow/operators/", line 112, in execute
>     return_value = self.execute_callable()
>   File "/usr/local/lib/python3.7/site-packages/airflow/operators/", line 117, in execute_callable
>     return self.python_callable(*self.op_args, **self.op_kwargs)
>   File "/root/airflow/dags/", line 48, in download_raw_data
>     dataObject = s3hook.get_key("poc/raw_data.csv.gz", s3_bucket)
>   File "/usr/local/lib/python3.7/site-packages/airflow/hooks/", line 217, in
>     obj.load()
>   File "/usr/local/lib/python3.7/site-packages/boto3/resources/", line 505, in do_action
>     response = action(self, *args, **kwargs)
>   File "/usr/local/lib/python3.7/site-packages/boto3/resources/", line 83, in
>     response = getattr(parent.meta.client, operation_name)(**params)
>   File "/usr/local/lib/python3.7/site-packages/botocore/", line 357, in _api_call
>     return self._make_api_call(operation_name, kwargs)
>   File "/usr/local/lib/python3.7/site-packages/botocore/", line 661, in _make_api_call
>     raise error_class(parsed_response, operation_name)
> botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden
> {code}
> Reading the code of the hook (,
I understand that the session token is not read from the extra config. The only case in which a session token is passed to the boto3 client is when the hook itself assumes a role. In my case, I want to use a role I have already assumed.
> So my suggestion is to read the session token from the extra config and use it to connect to AWS.
> Do you think this is the right way to do it? Does this workflow make sense?
> I am ready to contribute if my suggestion is accepted.
> Regards
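
The suggestion in the issue could look roughly like the sketch below. It is not the change made in pull request #6303. It only shows, under stated assumptions, how the connection's login, password, and extra JSON could be turned into keyword arguments for `boto3.session.Session`, which accepts `aws_access_key_id`, `aws_secret_access_key`, and `aws_session_token`. The helper name `session_kwargs_from_connection` and the placeholder values are hypothetical.

```python
import json


def session_kwargs_from_connection(login, password, extra):
    """Hypothetical helper: build boto3 Session kwargs from connection fields.

    `login` and `password` stand for the connection's access key id and
    secret access key; `extra` is the connection's extra field as a JSON string.
    """
    extra_config = json.loads(extra) if extra else {}
    kwargs = {
        "aws_access_key_id": login,
        "aws_secret_access_key": password,
    }
    # Only pass a session token when the extra config actually carries one,
    # so connections with long-lived keys keep working unchanged.
    token = extra_config.get("aws_session_token")
    if token:
        kwargs["aws_session_token"] = token
    return kwargs


kwargs = session_kwargs_from_connection(
    "<aws_access_key_id>",
    "<aws_secret_access_key>",
    '{"aws_session_token": "<aws_session_token>"}',
)

# With boto3 installed, the result could be used as:
#   session = boto3.session.Session(**kwargs)
#   s3 = session.resource("s3")
```

Leaving the token out when it is absent keeps the behaviour for existing connections the same, which matches the pull request's note that there is no other impact.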
