airflow-commits mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AIRFLOW-2299) Add S3 Select functionality to S3FileTransformOperator
Date Tue, 17 Apr 2018 08:54:00 GMT

    [ https://issues.apache.org/jira/browse/AIRFLOW-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440619#comment-16440619 ]

ASF subversion and git services commented on AIRFLOW-2299:
----------------------------------------------------------

Commit 6e82f1d7c9fa391c636a0155cdb19aa6cbda0821 in incubator-airflow's branch refs/heads/master
from [~sekikn]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=6e82f1d ]

[AIRFLOW-2299] Add S3 Select functionality to S3FileTransformOperator

Currently, S3FileTransformOperator downloads the whole file from S3
before transforming and uploading it. Adding an extraction feature
based on S3 Select to this operator improves its efficiency and
usability.

Closes #3227 from sekikn/AIRFLOW-2299


> Add S3 Select functionality to S3FileTransformOperator
> ------------------------------------------------------
>
>                 Key: AIRFLOW-2299
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2299
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: aws, operators
>            Reporter: Kengo Seki
>            Assignee: Kengo Seki
>            Priority: Major
>             Fix For: 2.0.0
>
>
> S3FileTransformOperator downloads the whole file from S3 before transforming and uploading it, which is inefficient when the original file is large but only a small part of it is needed.
> S3 Select, [which became GA recently|https://aws.amazon.com/about-aws/whats-new/2018/04/amazon-s3-select-is-now-generally-available/], can improve its efficiency and usability.
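
For illustration only, the idea described above can be sketched in Python as follows. This is not the code from the commit, which is not shown in this message. It assumes a boto3-style S3 client whose select_object_content method returns an event stream under the "Payload" key, and a CSV object with a header row. The helper names build_select_request and select_from_s3 are hypothetical.

```python
def build_select_request(bucket, key, expression):
    """Build keyword arguments for an S3 Select call on a CSV object.

    Assumption: the object is CSV with a header row, so columns can be
    referenced by name in the SQL expression.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "Expression": expression,
        "ExpressionType": "SQL",
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


def select_from_s3(client, bucket, key, expression):
    """Return only the rows matched by `expression` instead of the whole file.

    `client` is expected to behave like boto3.client("s3"); it is passed in
    so that the sketch does not depend on credentials or network access.
    """
    response = client.select_object_content(
        **build_select_request(bucket, key, expression)
    )
    chunks = []
    # The response payload is a stream of events; only "Records" events
    # carry result bytes. Other events (e.g. "Stats", "End") are skipped.
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")
```

With a real client this might be called as select_from_s3(boto3.client("s3"), "my-bucket", "data.csv", "SELECT * FROM S3Object s WHERE s.status = 'error'"), where the bucket, key, and column name are placeholders. Only the selected rows are transferred, which is the efficiency gain the issue describes.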



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
