beam-commits mailing list archives

From "Daniel Halperin (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (BEAM-383) BigQueryIO: update sink to shard into multiple write jobs
Date Thu, 04 Aug 2016 06:53:20 GMT

     [ https://issues.apache.org/jira/browse/BEAM-383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Halperin resolved BEAM-383.
----------------------------------
       Resolution: Fixed
    Fix Version/s: 0.3.0-incubating

> BigQueryIO: update sink to shard into multiple write jobs
> ---------------------------------------------------------
>
>                 Key: BEAM-383
>                 URL: https://issues.apache.org/jira/browse/BEAM-383
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-gcp
>            Reporter: Daniel Halperin
>            Assignee: Ian Zhou
>             Fix For: 0.3.0-incubating
>
>
> BigQuery imposes global limits on both the number of files that can be written in a
> single load job and the total bytes across those files. We should modify
> BigQueryIO.Write to chunk its input into multiple smaller jobs that each stay within
> these limits, write to temporary tables, and then atomically copy the results into
> the destination table.
> This functionality will let us safely stay within BigQuery's load job limits.
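The chunking step described above can be sketched as a greedy partition: group the files for a load into shards so that each shard respects both per-job limits, then run one load job per shard into its own temp table. The sketch below is illustrative only; the class name, method, and limit constants are assumptions, not Beam's actual implementation or BigQuery's exact quota values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the sharding logic described in the issue:
// greedily group files into load jobs so each job stays under BigQuery's
// per-job limits on file count and total bytes. Names and limit values
// here are illustrative, not Beam's real API or BigQuery's real quotas.
public class LoadJobSharder {
  static final int MAX_FILES_PER_JOB = 10_000;                     // assumed limit
  static final long MAX_BYTES_PER_JOB = 12L * 1_000_000_000_000L;  // assumed limit

  // Partition file sizes (in bytes) into shards that each respect both limits.
  // Each resulting shard would become one load job into a temp table.
  public static List<List<Long>> shard(List<Long> fileSizes) {
    List<List<Long>> shards = new ArrayList<>();
    List<Long> current = new ArrayList<>();
    long currentBytes = 0;
    for (long size : fileSizes) {
      // Start a new shard if adding this file would exceed either limit.
      if (!current.isEmpty()
          && (current.size() >= MAX_FILES_PER_JOB
              || currentBytes + size > MAX_BYTES_PER_JOB)) {
        shards.add(current);
        current = new ArrayList<>();
        currentBytes = 0;
      }
      current.add(size);
      currentBytes += size;
    }
    if (!current.isEmpty()) {
      shards.add(current);
    }
    return shards;
  }
}
```

After loading each shard into its own temp table, a final copy job can append all temp tables to the destination table in one atomic operation, so readers never observe a partially loaded result.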



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
