I didn't have any other changes.
I ran the tests with a clean virtualenv as you suggested and it works now. :)

Thanks Ahmet and Chamikara!

On Tue, Jun 4, 2019 at 6:36 AM Chamikara Jayalath <chamikara@google.com> wrote:
Sounds like the job you submitted was somehow incompatible with the Dataflow worker. Running from a clean virtual env should help verify that, as Ahmet mentioned.

On Mon, Jun 3, 2019 at 5:44 PM Ahmet Altay <altay@google.com> wrote:
Do you have any other changes? Are you trying from HEAD with a clean virtual environment?

If you can share a link to the Dataflow job (in the apache-beam-testing GCP project), we can try to look at additional logs as well.

On Mon, Jun 3, 2019 at 1:42 PM Tanay Tummalapalli <ttanay100@gmail.com> wrote:
Hi everyone,

I ran the integration tests BigQueryStreamingInsertTransformIntegrationTests[1] and BigQueryFileLoadsIT[2] on the master branch locally, with the following command:
./scripts/run_integration_test.sh --test_opts --tests=apache_beam.io.gcp.bigquery_test:BigQueryStreamingInsertTransformIntegrationTests
The Dataflow jobs for the tests failed with the following error:
root: INFO: 2019-06-03T18:36:53.021Z: JOB_MESSAGE_ERROR: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 649, in do_work
    work_executor.execute()
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 150, in execute
    test_shuffle_sink=self._test_shuffle_sink)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 116, in create_operation
    is_streaming=False)
  File "apache_beam/runners/worker/operations.py", line 962, in apache_beam.runners.worker.operations.create_operation
    op = BatchGroupAlsoByWindowsOperation(
  File "dataflow_worker/shuffle_operations.py", line 219, in dataflow_worker.shuffle_operations.BatchGroupAlsoByWindowsOperation.__init__
    self.windowing = deserialize_windowing_strategy(self.spec.window_fn)
  File "dataflow_worker/shuffle_operations.py", line 207, in dataflow_worker.shuffle_operations.deserialize_windowing_strategy
    return pickler.loads(serialized_data)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py", line 248, in loads
    c = base64.b64decode(encoded)
  File "/usr/lib/python2.7/base64.py", line 78, in b64decode
    raise TypeError(msg)
TypeError: Incorrect padding

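For what it's worth, the "Incorrect padding" TypeError simply means base64.b64decode received a string whose length is not a multiple of four. A minimal sketch that reproduces the same error on Python 2.7 (the payload here is hypothetical, not the actual pickled windowing strategy):

import base64

# A well-formed base64 string decodes cleanly.
print(base64.b64decode('aGVsbG8='))  # prints 'hello'

# Dropping characters (e.g. a payload mangled between mismatched SDK
# and worker versions) leaves a length that is not a multiple of 4;
# Python 2.7's base64 raises exactly this error for it.
base64.b64decode('aGVsbG8')  # TypeError: Incorrect padding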

I ran the same tests on the 2.13.0-RC#2 branch as well, and they passed. These tests also don't fail in the most recent Python post-commit runs[3-5].

Keeping in mind the recent base64 changes in the BigQuery IO, none of the tests in the test classes mentioned above uses a "BYTES"-type field (a sketch of the kind of usage that would exercise those changes is below).
I would love pointers to possible causes.
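For clarity, here is that sketch: a hypothetical row destined for a table whose schema has a BYTES column (the schema and field names are made up):

import base64

# Hypothetical schema and row; nothing like this appears in the tests above.
schema = 'name:STRING,payload:BYTES'
row = {
    'name': 'example',
    # BigQuery requires BYTES values to be base64-encoded in the JSON insert payload.
    'payload': base64.b64encode(b'\x00\x01\x02'),
}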

Thank you,
- TT