beam-commits mailing list archives

From "Luke Cwik (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-2101) (Potential) Reading public GCS files requires authentication
Date Thu, 27 Apr 2017 17:13:04 GMT

    [ https://issues.apache.org/jira/browse/BEAM-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987056#comment-15987056 ]

Luke Cwik commented on BEAM-2101:
---------------------------------

We ran into the same problem in the Java code base. The issue was that we assumed we always
needed credentials, instead of trying to get credentials and, if that failed, attempting the
call anyway. This affected all GCP IOs, such as GCS, Datastore, and BigQuery.
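
A minimal sketch of the fallback described above, written in Python since the issue is filed
against sdk-py. It is not the actual Beam code: `CredentialsUnavailableError`,
`get_credentials_or_none`, `read_object`, and `failing_loader` are all hypothetical names used
only for illustration, and the storage call is a toy stand-in rather than a real GCS client.

```python
class CredentialsUnavailableError(Exception):
    """Hypothetical stand-in for an error like ApplicationDefaultCredentialsError."""


def get_credentials_or_none(load_credentials):
    """Try to load credentials; fall back to None (anonymous) if unavailable."""
    try:
        return load_credentials()
    except CredentialsUnavailableError:
        # Do not fail here: public resources can still be read anonymously.
        return None


def read_object(path, credentials, public_paths):
    """Toy storage call (assumption): public paths need no credentials."""
    if credentials is None and path not in public_paths:
        raise PermissionError("authentication required for " + path)
    return "contents of " + path


def failing_loader():
    """Simulates an environment with no default credentials configured."""
    raise CredentialsUnavailableError("no default credentials available")


PUBLIC_PATHS = {"gs://apache-beam-samples/shakespeare/hamlet.txt"}

creds = get_credentials_or_none(failing_loader)
result = read_object(
    "gs://apache-beam-samples/shakespeare/hamlet.txt", creds, PUBLIC_PATHS)
```

With this shape, a missing credential only becomes an error when the service itself rejects
the anonymous call, so reads of public files succeed and reads of private files still fail
with a permission error.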

> (Potential) Reading public GCS files requires authentication
> ------------------------------------------------------------
>
>                 Key: BEAM-2101
>                 URL: https://issues.apache.org/jira/browse/BEAM-2101
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py
>            Reporter: Ahmet Altay
>
> cc: [~sb2nov]
> [~tibor.kiss@gmail.com] reported this at the hackathon while running wordcount. Even though
> plain gsutil works fine on the public file, wordcount fails. It is possible that some other
> authentication is required as part of the validation.
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "apache_beam/io/textio.py", line 390, in __init__
>     skip_header_lines=skip_header_lines)
>   File "apache_beam/io/textio.py", line 89, in __init__
>     validate=validate)
>   File "apache_beam/io/filebasedsource.py", line 103, in __init__
>     self._validate()
>   File "apache_beam/utils/value_provider.py", line 101, in _f
>     return fnc(self, *args, **kwargs)
>   File "apache_beam/io/filebasedsource.py", line 163, in _validate
>     match_result = FileSystems.match([pattern], limits=[1])[0]
>   File "apache_beam/io/filesystems.py", line 81, in match
>     return filesystem.match(patterns, limits)
>   File "apache_beam/io/gcp/gcsfilesystem.py", line 100, in match
>     raise BeamIOError("Match operation failed", exceptions)
> apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions {'gs://apache-beam-samples/shakespeare/hamlet.txt':
> ApplicationDefaultCredentialsError('The Application Default Credentials are not available.
> They are available if running in Google Compute Engine. Otherwise, the environment variable
> GOOGLE_APPLICATION_CREDENTIALS must be defined pointing to a file defining the credentials.
> See https://developers.google.com/accounts/docs/application-default-credentials for more information.',)}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
