beam-commits mailing list archives

From "Eugene Kirpichov (JIRA)" <j...@apache.org>
Subject [jira] [Created] (BEAM-2864) Support backfill deduplication in BigQueryIO.write()
Date Thu, 07 Sep 2017 20:22:01 GMT
Eugene Kirpichov created BEAM-2864:
--------------------------------------

             Summary: Support backfill deduplication in BigQueryIO.write()
                 Key: BEAM-2864
                 URL: https://issues.apache.org/jira/browse/BEAM-2864
             Project: Beam
          Issue Type: Bug
          Components: sdk-java-gcp
            Reporter: Eugene Kirpichov
            Assignee: Reuven Lax


See https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/603, which was motivated by
the Stack Overflow question
https://stackoverflow.com/questions/46076914/apache-beam-update-bigquery-table-row-with-bigqueryio

Perhaps one way to do this is to make BigQueryIO.write() return a PValue that can be sequenced
with other things, and to implement a BigQuery.update() transform that executes a single DML
statement (or a small number of them, since DML statements in BigQuery are a scarce resource),
letting users sandwich the two together if they would like to.
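
To make the "sandwich" idea concrete, here is a minimal plain-Java sketch of the sequencing
only. It does not use Beam or BigQuery: FakeTable, write(), dedupe() and backfill() are all
hypothetical stand-ins invented for illustration, the CompletableFuture plays the role of the
PValue returned by the write, and "keep the last row per key" is just one assumed form the
single DML statement might take.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

public class SandwichSketch {
    /** Hypothetical in-memory stand-in for a BigQuery table of (key, value) rows. */
    static class FakeTable {
        private final List<Map.Entry<String, String>> rows = new ArrayList<>();

        /** Stand-in for the write step: appends rows and returns a completion signal. */
        synchronized CompletableFuture<Void> write(List<Map.Entry<String, String>> newRows) {
            rows.addAll(newRows);
            return CompletableFuture.completedFuture(null);
        }

        /** Stand-in for one DML statement: keep only the last value written per key. */
        synchronized void dedupe() {
            Map<String, String> latest = new LinkedHashMap<>();
            for (Map.Entry<String, String> row : rows) {
                latest.put(row.getKey(), row.getValue());
            }
            rows.clear();
            for (Map.Entry<String, String> e : latest.entrySet()) {
                rows.add(Map.entry(e.getKey(), e.getValue()));
            }
        }

        /** Returns the current rows rendered as "key=value" strings. */
        synchronized List<String> snapshot() {
            List<String> out = new ArrayList<>();
            for (Map.Entry<String, String> row : rows) {
                out.add(row.getKey() + "=" + row.getValue());
            }
            return out;
        }
    }

    /** The sandwich: the dedup step runs only after the write has signalled completion. */
    static List<String> backfill(FakeTable table, List<Map.Entry<String, String>> rows) {
        table.write(rows).thenRun(table::dedupe).join();
        return table.snapshot();
    }

    public static void main(String[] args) {
        FakeTable table = new FakeTable();
        // Initial load.
        table.write(List.of(Map.entry("a", "1"), Map.entry("b", "2"))).join();
        // The backfill re-delivers key "a" with a newer value.
        System.out.println(backfill(table, List.of(Map.entry("a", "3"))));
        // prints [a=3, b=2]
    }
}
```

The point of the sketch is only the ordering guarantee: because the write hands back something
that can be waited on, the update step can be chained after it and never sees a partially
written table.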



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
