beam-commits mailing list archives

From "Uwe Jugel (JIRA)" <>
Subject [jira] [Commented] (BEAM-1909) BigQuery read transform fails for DirectRunner when querying non-US regions
Date Mon, 22 May 2017 13:11:04 GMT


Uwe Jugel commented on BEAM-1909:

Here are my latest test results regarding this issue:

# I just tried and failed to query across regions:
SELECT a.user_id FROM `test_dummy_eu.user_details` a, `test_dummy_us.user_details` b WHERE
a.user_id = b.user_id
-- Error: Cannot process data across locations: EU,US
# Since we cannot query across regions, I tried to determine the single location/region of
the data source(s) by dry-running the query and then checking the location of the
bq-internal temp table. However, this does not work: the temp table always reports
{{None}}, i.e., US, as its location, even if the source table is in an EU dataset.
# However, we can still *transfer the data from the query's own temp table to our temp table
using a {{CopyJob}}*, which works across regions. Here is a gist that demonstrates how to do
this via the BigQuery Python SDK:
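
The gist itself is not preserved in this archive. As a stand-in, here is a minimal sketch of the request body that a copy job would send to the BigQuery REST API ({{jobs.insert}} with a {{configuration.copy}} section). It is not the author's code: the function names, the project {{my-project}}, and the destination {{temp_dataset.temp_table}} are hypothetical, and the sketch only builds the body without calling BigQuery. Whether the copy succeeds across regions is the claim made above and is not checked here.

```python
import json


def table_ref(project_id, dataset_id, table_id):
    """Return a BigQuery REST API TableReference as a plain dict."""
    return {"projectId": project_id, "datasetId": dataset_id, "tableId": table_id}


def build_copy_job_body(source, destination, write_disposition="WRITE_TRUNCATE"):
    """Build the body of a jobs.insert request that copies one table to another.

    `source` and `destination` are dicts as produced by table_ref().
    Nothing is sent to BigQuery; the caller is expected to submit the body.
    """
    return {
        "configuration": {
            "copy": {
                "sourceTable": source,
                "destinationTable": destination,
                # Create the destination table if it does not exist yet.
                "createDisposition": "CREATE_IF_NEEDED",
                # WRITE_TRUNCATE overwrites any existing destination data.
                "writeDisposition": write_disposition,
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical project and destination names, for illustration only.
    body = build_copy_job_body(
        table_ref("my-project", "test_dummy_eu", "user_details"),
        table_ref("my-project", "temp_dataset", "temp_table"),
    )
    print(json.dumps(body, indent=2))
```

In the scenario described above, the source would be the query's own temp table rather than {{user_details}}; the structure of the body stays the same.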

I believe using a {{CopyJob}} is the appropriate way of copying any table to a temp
table, especially for non-query sources. We currently query those with a {{SELECT *}},
which may be billed to the user (\?), even though it should be covered by the free data
export quotas (see here and here):


|CopyJob (py)||
|copy job (API)||
|BQ-read in DataFlow == BQ-export||
|free BQ-export||
|costly(\?) "SELECT *" for non-queries||

> BigQuery read transform fails for DirectRunner when querying non-US regions
> ---------------------------------------------------------------------------
>                 Key: BEAM-1909
>                 URL:
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py
>            Reporter: Chamikara Jayalath
> See:
> This should be fixed by creating the temp dataset and table in the correct region.
> cc: [~sb2nov]

This message was sent by Atlassian JIRA
