airflow-commits mailing list archives

From "ASF subversion and git services (JIRA)" <>
Subject [jira] [Commented] (AIRFLOW-314) bigquery cursor run_table_upsert method may fail for large datasets
Date Tue, 05 Jul 2016 23:31:10 GMT


ASF subversion and git services commented on AIRFLOW-314:

Commit 2d7c830858d3c220888d77120148c14c97f892df in incubator-airflow's branch refs/heads/master
from [~moirat]

[AIRFLOW-314] Fix BigQuery cursor run_table_upsert method

Closes #1652 from mtagle/fix_bq_table_upsert

By default, BigQuery returns only 50 tables when you ask for a list
of all the tables in a dataset. If you try to upsert a table that
exists in a dataset with more than 50 tables, the run_table_upsert
method may conclude that the table doesn't exist and try to insert it,
and BigQuery will return an error saying that the table already exists.

This fix checks if the response has pagination data, and looks at all
the pages, rather than just the first one, to see if the table exists.
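
A minimal sketch of the paginated existence check described above, not the
code from the commit itself. It assumes a tables.list-style response: a dict
with a "tables" list whose entries carry tableReference.tableId, plus a
"nextPageToken" key when more pages remain. The names table_exists,
list_page, and make_fake_lister are hypothetical, and the fake lister stands
in for the real API call so the example runs on its own.

```python
def table_exists(list_page, table_id):
    """Return True if table_id appears on any page of the listing.

    list_page is a hypothetical callable: it takes a page token (or None
    for the first page) and returns one tables.list-style response dict.
    """
    page_token = None
    while True:
        response = list_page(page_token)
        for table in response.get("tables", []):
            if table["tableReference"]["tableId"] == table_id:
                return True
        # Keep going only while the response advertises another page.
        page_token = response.get("nextPageToken")
        if not page_token:
            return False


def make_fake_lister(table_ids, page_size=50):
    """Build an in-memory stand-in for a paginated listing call (assumption)."""
    def list_page(page_token=None):
        start = int(page_token or 0)
        end = start + page_size
        response = {
            "tables": [
                {"tableReference": {"tableId": t}} for t in table_ids[start:end]
            ]
        }
        if end < len(table_ids):
            response["nextPageToken"] = str(end)
        return response
    return list_page


# A dataset with 120 tables: "t119" only shows up on the third page.
all_tables = ["t%d" % i for i in range(120)]
lister = make_fake_lister(all_tables)
first_page_ids = [t["tableReference"]["tableId"] for t in lister()["tables"]]
found_on_first_page = "t119" in first_page_ids
found_with_pagination = table_exists(lister, "t119")
```

Checking only the first page misses "t119", which is the failure mode the
issue describes; following the page tokens finds it.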

> bigquery cursor run_table_upsert method may fail for large datasets
> -------------------------------------------------------------------
>                 Key: AIRFLOW-314
>                 URL:
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: contrib, hooks
>            Reporter: Moira Tagle
>            Assignee: Moira Tagle
> If a dataset has more than 50 tables, run_table_upsert may fail to find the table it's
> looking for, and incorrectly attempt to insert it (rather than update it).

This message was sent by Atlassian JIRA
