airflow-commits mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-1795) S3Hook no longer accepts s3_conn_id breaking build in ops/sensors and back-compat
Date Sat, 18 Nov 2017 13:09:07 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258057#comment-16258057 ]

ASF subversion and git services commented on AIRFLOW-1795:
----------------------------------------------------------

Commit 98df0d6e3b2e2b439ab46d6c9ba736777202414a in incubator-airflow's branch refs/heads/master
from [~ashb]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=98df0d6 ]

[AIRFLOW-1795] Correctly call S3Hook after migration to boto3

In the migration of S3Hook to boto3 the connection ID parameter changed to
`aws_conn_id`. This fixes the uses of `s3_conn_id` in the code base and adds
a note to UPDATING.md about the change.

While correcting the tests for S3ToHiveTransfer I noticed that S3Hook.get_key
was returning a dictionary, rather than the S3.Object mentioned in its
docstring. The important thing that was missing was the ability to get the
key name from the return of a call to get_wildcard_key.
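The missing behaviour can be illustrated with a small fnmatch-based sketch. The function name mirrors S3Hook.get_wildcard_key, but the bucket listing is passed in as a plain list purely for illustration; the real hook lists keys via boto3:

```python
import fnmatch

def get_wildcard_key(wildcard_key, keys):
    """Return the first key name matching the wildcard pattern.

    If the lookup only returned the raw get_object() response dict,
    the caller could not recover which key actually matched; returning
    the key name (or an S3.Object, which carries a .key attribute)
    preserves that information.
    """
    for key in keys:
        if fnmatch.fnmatch(key, wildcard_key):
            return key
    return None
```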

Closes #2795 from ashb/AIRFLOW-1795-s3hook_boto3_fixes


> S3Hook no longer accepts s3_conn_id breaking build in ops/sensors and back-compat 
> ----------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-1795
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1795
>             Project: Apache Airflow
>          Issue Type: Bug
>    Affects Versions: 1.9.0
>            Reporter: Ash Berlin-Taylor
>             Fix For: 1.9.0
>
>
> Found whilst testing Airflow 1.9.0rc1
> Previously the S3Hook accepted a parameter of {{s3_conn_id}}. As part of AIRFLOW-1520 we moved S3Hook to have a superclass of AWSHook, which accepts an {{aws_conn_id}} parameter instead.
> This breaks backwards compatibility generally, and more specifically it breaks the built-in S3KeySensor, which does this:
> {code}
>     def poke(self, context):
>         from airflow.hooks.S3_hook import S3Hook
>         hook = S3Hook(s3_conn_id=self.s3_conn_id)
> {code}
> There are a few other instances of s3_conn_id in the code base that will probably also need updating.
> My first thought was to add a shim mapping s3_conn_id to aws_conn_id in the S3Hook with a deprecation warning, but the surface area of places where this is exposed is larger, and I could add such a deprecation warning to all of them. Does anyone have thoughts on the best way? The options are:
> - Rename all instances, with deprecation warnings.
> - S3Hook accepts {{s3_conn_id}} and passes it down to {{aws_conn_id}} in the superclass.
> - Update existing references in the code base to {{aws_conn_id}}, and note in UPDATING.md the need to update user code. (This is my least preferred option.)
> {noformat}
> airflow/operators/redshift_to_s3_operator.py
> 33:    :param s3_conn_id: reference to a specific S3 connection
> 34:    :type s3_conn_id: string
> 51:            s3_conn_id='s3_default',
> 62:        self.s3_conn_id = s3_conn_id
> 69:        self.s3 = S3Hook(s3_conn_id=self.s3_conn_id)
> airflow/operators/s3_file_transform_operator.py
> 40:    :param source_s3_conn_id: source s3 connection
> 41:    :type source_s3_conn_id: str
> 44:    :param dest_s3_conn_id: destination s3 connection
> 45:    :type dest_s3_conn_id: str
> 62:            source_s3_conn_id='s3_default',
> 63:            dest_s3_conn_id='s3_default',
> 68:        self.source_s3_conn_id = source_s3_conn_id
> 70:        self.dest_s3_conn_id = dest_s3_conn_id
> 75:        source_s3 = S3Hook(s3_conn_id=self.source_s3_conn_id)
> 76:        dest_s3 = S3Hook(s3_conn_id=self.dest_s3_conn_id)
> airflow/operators/s3_to_hive_operator.py
> 74:    :param s3_conn_id: source s3 connection
> 75:    :type s3_conn_id: str
> 102:            s3_conn_id='s3_default',
> 119:        self.s3_conn_id = s3_conn_id
> 130:        self.s3 = S3Hook(s3_conn_id=self.s3_conn_id)
> airflow/operators/sensors.py
> 504:    :param s3_conn_id: a reference to the s3 connection
> 505:    :type s3_conn_id: str
> 514:            s3_conn_id='s3_default',
> 531:        self.s3_conn_id = s3_conn_id
> 535:        hook = S3Hook(s3_conn_id=self.s3_conn_id)
> 568:            s3_conn_id='s3_default',
> 576:        self.s3_conn_id = s3_conn_id
> 582:        hook = S3Hook(s3_conn_id=self.s3_conn_id)
> {noformat}
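The second option listed in the issue (S3Hook accepting {{s3_conn_id}} and passing it down to {{aws_conn_id}}) could be sketched as follows; the AwsHook stand-in here is hypothetical and only mimics the superclass constructor's signature:

```python
import warnings

class AwsHook:
    # Minimal stand-in for Airflow's AWS base hook: it only records
    # the connection ID, as the real superclass's constructor does.
    def __init__(self, aws_conn_id="aws_default"):
        self.aws_conn_id = aws_conn_id

class S3Hook(AwsHook):
    def __init__(self, aws_conn_id="aws_default", s3_conn_id=None):
        # Back-compat shim: map the old parameter onto the new one
        # and warn callers so they can migrate.
        if s3_conn_id is not None:
            warnings.warn(
                "s3_conn_id is deprecated; please use aws_conn_id",
                DeprecationWarning,
                stacklevel=2,
            )
            aws_conn_id = s3_conn_id
        super().__init__(aws_conn_id=aws_conn_id)
```

With this shim, `S3Hook(s3_conn_id='s3_default')` keeps working while steering users toward the new parameter name.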



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
