beam-user mailing list archives

From Ahmet Altay <al...@google.com>
Subject Re: google.cloud.bigQuery version on workers - please HELP
Date Sat, 14 Jul 2018 01:02:39 GMT
On Thu, Jul 12, 2018 at 7:35 PM, OrielResearch Eila Arich-Landkof <
eila@orielresearch.org> wrote:

> Hi Ahmet,
>
>
> I have received the version from the worker using the following commands:
>
> import logging
> from google.cloud import bigquery
> logging.info('bigquery.__version__ is %s', bigquery.__version__)
>
> I tried a few times to install google-cloud-bigquery on the workers using
> setup.py, without much success:
>
> from setuptools import setup, find_packages
>
> setup(
>   name='label-or',
>   version='1.0.0',
>   packages=find_packages(),
>   keywords=[
>   ],
>   license="Apache Software License",
>   install_requires=[
>     'google-cloud-bigquery==0.28.0',
>   ],
>   package_data={
>   },
>   data_files=[],
> )
>
>
> In the job report UI, this message is shown (I don't know if it is
> relevant to the dependencies):
> SDK version
> Google Cloud Dataflow SDK for Python 2.0.0
>  A newer version of this SDK is available.
> <https://cloud.google.com/dataflow/support?hl=en_US>
>

Yes, this is related to the SDK version you are using. Dataflow worker
containers have different dependencies for each new SDK version. 2.0.0 is
an old version, which explains why you were seeing 0.23.0 as the installed
version.
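
To confirm which versions a given worker container actually has, a minimal
sketch like the following could be called from pipeline code and its result
logged. The helper names installed_version and log_versions are hypothetical,
and the sketch assumes a Python 3.8+ environment where importlib.metadata is
available:

```python
import logging
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


def log_versions(package_names):
    """Log and return the installed version of each named distribution."""
    versions = {name: installed_version(name) for name in package_names}
    for name, version in versions.items():
        logging.info('%s version on this machine: %s', name, version)
    return versions
```

Calling log_versions(['google-cloud-bigquery']) from code that runs inside the
pipeline should report what the worker has installed, rather than what the
client environment has.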


>
>
> I was able to upgrade to 0.25.0 (bigquery.__version__ reports 0.25.0) but
> not to 0.28.0, which has a different API. Could you please advise what I am
> missing? Is it impossible to work with a newer version?
>

Beam supports google-cloud-bigquery up to version 0.25.0. A recent attempt
to upgrade it uncovered issues due to the API differences (details:
https://github.com/apache/beam/pull/5895). There is a recent push for Beam
to upgrade all dependencies to their latest versions, and I assume this
will be addressed as part of it.

Unfortunately, until that fix lands, it is not possible to use the latest
version of the bigquery library.
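
Until that upgrade lands, one option is to pin
'google-cloud-bigquery==0.25.0' in install_requires and fail fast if a newer
version shows up. A rough sketch of such a guard follows; the names
MAX_SUPPORTED, parse_version and is_supported are hypothetical, and the parser
assumes purely numeric dotted versions such as 0.25.0:

```python
MAX_SUPPORTED = '0.25.0'  # upper bound discussed in this thread


def parse_version(version):
    """Turn a numeric dotted version such as '0.25.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split('.'))


def is_supported(version, max_supported=MAX_SUPPORTED):
    """Return True if `version` is not newer than `max_supported`."""
    return parse_version(version) <= parse_version(max_supported)
```

For example, is_supported(bigquery.__version__) could be checked once at
startup, so that an unsupported version is reported clearly instead of
surfacing later as an API error.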


>
> Many thanks,
> Eila
>
>
> On Thu, Jul 12, 2018 at 9:40 PM, Ahmet Altay <altay@google.com> wrote:
>
>> Hi Eila,
>>
>> You can find a list of dependencies installed in Dataflow workers in [1].
>> Dataflow workers will have a set of dependencies that will satisfy the
>> requirements from setup.py.
>>
>> Which bigquery library are you using? There is a
>> google-cloud-bigquery==0.25.0 dependency; I am not sure where the
>> 0.23.0 is coming from.
>>
>> Workers do not pick up libraries from the client environment as part of
>> the job submission. I am not sure how the datalab UI integration works;
>> however, you have a few options for installing any set of dependencies
>> on the workers. Using requirements.txt is one of those options.
>>
>> Ahmet
>>
>> [1] https://cloud.google.com/dataflow/docs/concepts/sdk-worker-dependencies#version-250_1
>>
>> On Thu, Jul 12, 2018 at 8:51 AM, OrielResearch Eila Arich-Landkof <
>> eila@orielresearch.org> wrote:
>>
>>> Hi all,
>>>
>>> I am running a Python pipeline that uses the google.cloud.bigquery
>>> library. On the local runner, everything runs great:
>>> bigquery.__version__ is 0.28.0
>>>
>>> On the Dataflow runner, bigquery.__version__ is 0.23.0,
>>> and there are many API changes between these versions.
>>>
>>> What is the best way to change the installed version on the workers?
>>> I was assuming that the workers have all of the master machine's
>>> libraries installed when the execution is done from datalab. Is that
>>> true? I am not generating any requirements.txt; the execution is done
>>> through the run button in the datalab UI.
>>>
>>>
>>> Please help me solve this issue.
>>> Thanks,
>>> --
>>> Eila
>>> www.orielresearch.org
>>> https://www.meetup.com/Deep-Learning-In-Production/
>>>
>>>
>>>
>>
>
>
> --
> Eila
> www.orielresearch.org
> https://www.meetup.com/Deep-Learning-In-Production/
>
>
>
