beam-commits mailing list archives

From "Jonathan Delfour (JIRA)" <>
Subject [jira] [Created] (BEAM-3757) Shuffle read failed using python 2.2.0
Date Tue, 27 Feb 2018 21:03:00 GMT
Jonathan Delfour created BEAM-3757:

             Summary: Shuffle read failed using python 2.2.0
                 Key: BEAM-3757
             Project: Beam
          Issue Type: Bug
          Components: runner-dataflow
    Affects Versions: 2.2.0
         Environment: gcp, macos
            Reporter: Jonathan Delfour
            Assignee: Thomas Groh


The first issue is that the Beam 2.3.0 Python SDK is apparently not working on GCP. It gets stuck:
Workflow failed. Causes: (bf832d44290fbf41): The Dataflow appears to be stuck. You can get
help with Cloud Dataflow at 
I tried twice.

Reverting to 2.2.0 usually works, but today, after more than an hour of processing
with 30 workers, I got a failure with these messages in the logs:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/", line 582,
in do_work
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/", line 167, in
  File "dataflow_worker/", line 49, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
    def start(self):
  File "dataflow_worker/", line 50, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
    with self.scoped_start_state:
  File "dataflow_worker/", line 65, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
    with self.shuffle_source.reader() as reader:
  File "dataflow_worker/", line 67, in dataflow_worker.shuffle_operations.GroupedShuffleReadOperation.start
    for key_values in reader:
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/", line 406, in __iter__
    for entry in entries_iterator:
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/", line 248, in next
    return next(self.iterator)
  File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/", line 206, in __iter__
    chunk, next_position = self.reader.Read(start_position, end_position)
  File "third_party/windmill/shuffle/python/shuffle_client.pyx", line 138, in shuffle_client.PyShuffleReader.Read
IOError: Shuffle read failed: INTERNAL: Received RST_STREAM with error code 2  talking to

I also get this informational message:

Refusing to split <dataflow_worker.shuffle.GroupedShuffleRangeTracker object at 0x7f03a00fe790>
at '\x00\x00\x00\t\x1d\x14\x87\xa3\x00\x01': unstarted

For the flow: I am extracting data from BigQuery, cleaning it with pandas, exporting it as
a CSV file, gzipping it, and uploading the compressed file to a bucket using decompressive
transcoding (the CSV export, gzip compression and upload are in the same 'worker' as they
are done in the same
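For illustration, a minimal sketch of the clean/export/compress stage described above (the record contents and cleanup step are hypothetical stand-ins, not taken from the actual pipeline; the upload to GCS with Content-Encoding: gzip for decompressive transcoding is only indicated in a comment):

```python
import gzip
import io

import pandas as pd

# Hypothetical stand-in for rows read from BigQuery.
rows = [{"id": 1, "name": " alice "}, {"id": 2, "name": "bob"}]

def clean_and_compress(records):
    """Clean records with pandas, export as CSV, and gzip the result
    in one step, mirroring the single-'worker' stage described above."""
    df = pd.DataFrame(records)
    df["name"] = df["name"].str.strip()  # example cleanup, assumption
    csv_bytes = df.to_csv(index=False).encode("utf-8")
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        gz.write(csv_bytes)
    return buf.getvalue()

payload = clean_and_compress(rows)
# For decompressive transcoding, the gzipped payload would then be
# uploaded to the bucket with Content-Encoding: gzip metadata set,
# so GCS serves it decompressed to clients.
```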

This message was sent by Atlassian JIRA
