airflow-commits mailing list archives

From "jack (Jira)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-5627) transform function should be optional for s3_file_tranformation_operator
Date Thu, 14 Nov 2019 11:15:00 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974152#comment-16974152 ]

jack commented on AIRFLOW-5627:
-------------------------------

If you don't want to transform anything, then don't use the transform operator.

An operator should do what it was designed to do.

For your use case of moving files from one path to another on S3, use copy_object in S3Hook.
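
A minimal sketch of that suggestion, assuming a hook object that exposes copy_object and delete_objects the way S3Hook does. The helper name move_s3_object, the bucket name, and the keys are hypothetical and only serve the illustration; the hook is passed in as an argument so the snippet itself does not need Airflow installed.

```python
def move_s3_object(hook, source_key, dest_key, source_bucket, dest_bucket=None):
    """Copy an object to a new key, then delete the original.

    `hook` is assumed to behave like Airflow's S3Hook, i.e. to provide
    copy_object(...) and delete_objects(bucket, keys). This helper is a
    hypothetical illustration, not part of Airflow.
    """
    dest_bucket = dest_bucket or source_bucket

    # Nothing to do if source and destination are the same object;
    # deleting here would destroy the only copy.
    if (source_bucket, source_key) == (dest_bucket, dest_key):
        return dest_bucket, dest_key

    # Server-side copy: the object content is never downloaded locally.
    hook.copy_object(
        source_bucket_key=source_key,
        dest_bucket_key=dest_key,
        source_bucket_name=source_bucket,
        dest_bucket_name=dest_bucket,
    )

    # Remove the original only after the copy call has returned.
    hook.delete_objects(bucket=source_bucket, keys=[source_key])
    return dest_bucket, dest_key


# Possible usage inside a PythonOperator callable (assumes an Airflow 1.10-style
# import path and a configured "aws_default" connection):
#
#   from airflow.hooks.S3_hook import S3Hook
#   move_s3_object(S3Hook(aws_conn_id="aws_default"),
#                  "incoming/data.csv", "archive/data.csv", "my-bucket")
```

If the original should stay in place, calling hook.copy_object alone is enough and the delete step can be dropped.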

 

> transform function should be optional for s3_file_tranformation_operator
> ------------------------------------------------------------------------
>
>                 Key: AIRFLOW-5627
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5627
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: operators
>    Affects Versions: 2.0.0
>            Reporter: Ke Zhu
>            Priority: Major
>
> h3. What happened
> After AIRFLOW-2299, users of S3FileTransformOperator must supply either {{select_expression}} or {{transform_script}}. For a use case such as moving objects without any content transformation, the only option is a hack like {{transform_script='/bin/cp'}}, which simply copies one temp file to another.
> If neither parameter is given, the operator raises an exception saying {{Either transform_script or select_expression must be specified}}. See [https://github.com/apache/airflow/blob/d719e1fd6705a93a0dfefef4b46478ade5e006ea/airflow/operators/s3_file_transform_operator.py#L110-L112]
> h3. Expected outcome
> An enhancement like -AIRFLOW-2299- should not force users either to adopt newly added features such as select_expression/transform_script or to resort to a hack to work around them. These two parameters should simply be optional.
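
To make the request in the quoted issue concrete, the sketch below contrasts the check described there with one possible relaxed behaviour. It is an illustration under stated assumptions, not the operator's actual code: TransformConfigError is a local stand-in for Airflow's exception type, and check_strict and plan_steps are hypothetical names.

```python
class TransformConfigError(Exception):
    """Local stand-in so the sketch runs without Airflow installed."""


def check_strict(transform_script=None, select_expression=None):
    """Behaviour described in the issue: at least one parameter is required."""
    if transform_script is None and select_expression is None:
        raise TransformConfigError(
            "Either transform_script or select_expression must be specified"
        )


def plan_steps(transform_script=None, select_expression=None):
    """Hypothetical relaxed behaviour: both parameters are optional.

    Returns the ordered steps the operator would perform. With neither
    parameter set, the plan degenerates to a plain download and upload.
    """
    steps = []
    # Fetch the source: either a filtered subset or the whole object.
    steps.append("select" if select_expression is not None else "download")
    # Run the user script only when one was supplied.
    if transform_script is not None:
        steps.append("transform")
    steps.append("upload")
    return steps
```

Under this sketch, plan_steps() with no arguments yields a download followed by an upload, which is the move-only case the reporter currently works around with {{transform_script='/bin/cp'}}.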



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
