beam-user mailing list archives

From OrielResearch Eila Arich-Landkof <e...@orielresearch.org>
Subject Re: Help with adding Python package dependencies when executing a Python pipeline
Date Tue, 03 Jul 2018 21:42:34 GMT
Based on
https://stackoverflow.com/questions/44423769/how-to-use-google-cloud-storage-in-dataflow-pipeline-run-from-datalab
I tried this:
options = PipelineOptions(flags=["--requirements_file", "./requirements.txt"])
The requirements file was generated by:
pip freeze > requirements.txt
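
Incidentally, the same setting can be applied on the options object itself
rather than through the flags list; a minimal sketch, assuming the standard
Beam 2.x import path:

from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions

options = PipelineOptions()
# Equivalent to passing --requirements_file in the flags list above.
options.view_as(SetupOptions).requirements_file = './requirements.txt'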

But it raises the following error:

CalledProcessError: Command '['/usr/local/envs/py2env/bin/python',
'-m', 'pip', 'install', '--download',
'/tmp/dataflow-requirements-cache', '-r', 'requirements.txt',
'--no-binary', ':all:']' returned non-zero exit status 1
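
Two possible causes, offered only as guesses: pip 10 removed the
"pip install --download" flag that this SDK version shells out to, so a
recently upgraded pip fails on the invocation itself; and --no-binary :all:
forces a source download of every package pinned by pip freeze, which fails
for any dependency that ships only wheels. A diagnostic sketch that re-runs
the exact command from the traceback so pip's own error output is visible
(the interpreter and cache paths are copied from the error above):

import subprocess

# Re-run the staging command Beam attempted; letting pip write to the
# console shows which package or flag actually caused the failure.
subprocess.check_call([
    '/usr/local/envs/py2env/bin/python', '-m', 'pip', 'install',
    '--download', '/tmp/dataflow-requirements-cache',
    '-r', 'requirements.txt',
    '--no-binary', ':all:',
])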


Any suggestions?
Thanks,
Eila

On Tue, Jul 3, 2018 at 5:25 PM, OrielResearch Eila Arich-Landkof <
eila@orielresearch.org> wrote:

> Thank you. Where do I add the reference to requirements.txt? Can I do it
> from the pipeline options code?
>
> On Tue, Jul 3, 2018 at 5:13 PM, Lukasz Cwik <lcwik@google.com> wrote:
>
>> Take a look at
>> https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/
>>
>> On Tue, Jul 3, 2018 at 2:09 PM OrielResearch Eila Arich-Landkof <
>> eila@orielresearch.org> wrote:
>>
>>> Hello all,
>>>
>>>
>>> I am using Python code to run my pipeline, similar to the following:
>>>
>>> options = PipelineOptions()
>>> google_cloud_options = options.view_as(GoogleCloudOptions)
>>> google_cloud_options.project = 'my-project-id'
>>> google_cloud_options.job_name = 'myjob'
>>> google_cloud_options.staging_location = 'gs://your-bucket-name-here/staging'
>>> google_cloud_options.temp_location = 'gs://your-bucket-name-here/temp'
>>> options.view_as(StandardOptions).runner = 'DataflowRunner'
>>>
>>>
>>>
>>> I would like to add the *pandas-gbq* package installation to my workers.
>>> What would be the recommended way to do so? Can I add it to the
>>> PipelineOptions()?
>>> I remember that there are a few options; one of them was creating a
>>> requirements text file, but I cannot remember where I saw it or whether it
>>> is the simplest way when running the pipeline from Datalab.
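
A minimal sketch of that requirements-file route, assuming the options are
built as in the snippet above; the file is written inline here purely for
illustration and lists only what the workers need:

from apache_beam.options.pipeline_options import PipelineOptions

# Stage only the extra worker dependency; Dataflow installs everything
# listed in this file on each worker at startup.
with open('requirements.txt', 'w') as f:
    f.write('pandas-gbq\n')

# Build the options with the requirements file, then apply the
# GoogleCloudOptions / StandardOptions settings shown above.
options = PipelineOptions(flags=['--requirements_file', './requirements.txt'])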
>>>
>>> Thank you for any reference!
>>>


-- 
Eila
www.orielresearch.org
https://www.meetup.com/Deep-Learning-In-Production/
