beam-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-383) BigQueryIO: update sink to shard into multiple write jobs
Date Wed, 24 Aug 2016 16:51:20 GMT

    [ https://issues.apache.org/jira/browse/BEAM-383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435276#comment-15435276 ]

ASF GitHub Bot commented on BEAM-383:
-------------------------------------

GitHub user dhalperi opened a pull request:

    https://github.com/apache/incubator-beam/pull/877

    [BEAM-383] BigQueryIO.Write: raise size limit to 11 TiB

    BigQuery has changed their total size quota to 12 TiB.
    https://cloud.google.com/bigquery/quota-policy#import

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dhalperi/incubator-beam bigquery-limits

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-beam/pull/877.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #877
    
----
commit 02648a968992617ba77393d889c8df9d0191b9ea
Author: Dan Halperin <dhalperi@google.com>
Date:   2016-08-24T16:49:46Z

    BigQueryIO.Write: raise size limit to 11 TiB
    
    BigQuery has changed their total size quota to 12 TiB.
    https://cloud.google.com/bigquery/quota-policy#import

----


> BigQueryIO: update sink to shard into multiple write jobs
> ---------------------------------------------------------
>
>                 Key: BEAM-383
>                 URL: https://issues.apache.org/jira/browse/BEAM-383
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-gcp
>            Reporter: Daniel Halperin
>            Assignee: Ian Zhou
>             Fix For: 0.3.0-incubating
>
>
> BigQuery has global limits on both the number of files that can be written in a single job and the total bytes in those files. We should be able to modify BigQueryIO.Write to chunk into multiple smaller jobs that meet these limits, write to temp tables, and atomically copy into the destination table.
> This functionality will let us safely stay within BQ's load job limits.
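
The following is a minimal, illustrative sketch (not the actual BigQueryIO.Write implementation) of the sharding idea described in the issue above: staged files are greedily grouped into load-job shards so that no shard exceeds BigQuery's per-job file-count or byte-size limits. The class name, method name, and limit constants are hypothetical placeholders.

    import java.util.ArrayList;
    import java.util.List;

    public class LoadJobSharder {
      // Hypothetical limits standing in for BigQuery's quota policy values.
      static final int MAX_FILES_PER_JOB = 10_000;
      static final long MAX_BYTES_PER_JOB = 11L * 1024 * 1024 * 1024 * 1024; // 11 TiB

      /** Greedily groups file sizes into shards that respect both limits. */
      static List<List<Long>> shard(List<Long> fileSizes) {
        List<List<Long>> shards = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long size : fileSizes) {
          // Start a new shard when adding this file would break either limit.
          if (!current.isEmpty()
              && (current.size() >= MAX_FILES_PER_JOB
                  || currentBytes + size > MAX_BYTES_PER_JOB)) {
            shards.add(current);
            current = new ArrayList<>();
            currentBytes = 0;
          }
          current.add(size);
          currentBytes += size;
        }
        if (!current.isEmpty()) {
          shards.add(current);
        }
        return shards;
      }
    }

Under this approach, each shard would be loaded into its own temporary table, and the temporary tables would then be copied atomically into the destination table.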



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
