airflow-commits mailing list archives

From "Ash Berlin-Taylor (JIRA)" <>
Subject [jira] [Commented] (AIRFLOW-1756) S3 Task Handler Cannot Read Logs With New S3Hook
Date Wed, 25 Oct 2017 09:17:05 GMT


Ash Berlin-Taylor commented on AIRFLOW-1756:

I'm going to test this later today, but I can't see where get_key returns a dict.
In both master and v1-9-test the S3Hook.get_key function looks like this:

    def get_key(self, key, bucket_name=None):
        """
        Returns a boto3.S3.Key object

        :param key: the path to the key
        :type key: str
        :param bucket_name: the name of the bucket
        :type bucket_name: str
        """
        if not bucket_name:
            (bucket_name, key) = self.parse_s3_url(key)
        return self.get_conn().get_object(Bucket=bucket_name, Key=key)

Do you have a full stack trace of the error please?
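If get_object's dict response is what reaches the task handler, the failure would look like the sketch below (the response dict is a stand-in for boto3's S3.Client.get_object return shape, not a real S3 call): a plain dict has no boto2-era get_contents_as_string() method, so calling it raises AttributeError.

```python
# Stand-in for the dict that boto3's get_object returns (assumed shape).
response = {"Body": b"log contents", "ContentLength": 12}

try:
    # This is what the old boto2-style call in s3_read would amount to.
    response.get_contents_as_string()
except AttributeError as exc:
    error = str(exc)

print(error)
```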

> S3 Task Handler Cannot Read Logs With New S3Hook
> ------------------------------------------------
>                 Key: AIRFLOW-1756
>                 URL:
>             Project: Apache Airflow
>          Issue Type: Bug
>    Affects Versions: 1.9.0
>            Reporter: Colin Son
> With the changes to the S3Hook, it seems like it cannot read the S3 task logs.
> In the `s3_read` in the
> {code}
> s3_key = self.hook.get_key(remote_log_location)
> if s3_key:
>     return s3_key.get_contents_as_string().decode()
> {code}
> Since the s3_key object is now a dict, you cannot call `get_contents_as_string()` on
> a dict object. You have to use the S3Hook's `read_key()` method to read the contents of the
> task logs now.
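A minimal sketch of the fix the reporter suggests, using a stand-in hook so it runs without S3 (FakeS3Hook and its backing dict are hypothetical; the real read_key lives on Airflow's S3Hook and reads the object body from S3):

```python
class FakeS3Hook:
    """Hypothetical stand-in for Airflow's S3Hook, for illustration only."""

    def __init__(self, objects):
        # Maps key -> raw bytes, mimicking objects stored in a bucket.
        self._objects = objects

    def read_key(self, key, bucket_name=None):
        # The real read_key fetches the object and returns its body as a string.
        return self._objects[key].decode("utf-8")


def s3_read(hook, remote_log_location):
    # Instead of: hook.get_key(...).get_contents_as_string().decode()
    return hook.read_key(remote_log_location)


hook = FakeS3Hook({"logs/task.log": b"task succeeded"})
print(s3_read(hook, "logs/task.log"))
```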

This message was sent by Atlassian JIRA
