beam-issues mailing list archives

From "ASF GitHub Bot (Jira)" <>
Subject [jira] [Work logged] (BEAM-11006) Allow Failsafe Handling of BigQuery Streaming Writes
Date Fri, 16 Oct 2020 20:53:00 GMT


ASF GitHub Bot logged work on BEAM-11006:

                Author: ASF GitHub Bot
            Created on: 16/Oct/20 20:52
            Start Date: 16/Oct/20 20:52
    Worklog Time Spent: 10m 
      Work Description: dhercher commented on a change in pull request #13055:

File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/
@@ -2002,6 +2007,11 @@ static String getExtractDestinationUri(String extractDestinationDir)
       return toBuilder().setFormatFunction(formatFunction).build();
+    /** Formats the user's type into a {@link TableRow} to be written to an error collector.
+    public Write<T> withFailsafeFormatFunction(SerializableFunction<T, TableRow> formatFunction) {

Review comment:
       Sounds good, renaming it to
   And adding more in the Javadoc

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:

Issue Time Tracking

            Worklog Id:     (was: 501671)
    Remaining Estimate: 334h  (was: 334h 10m)
            Time Spent: 2h  (was: 1h 50m)

> Allow Failsafe Handling of BigQuery Streaming Writes
> ----------------------------------------------------
>                 Key: BEAM-11006
>                 URL:
>             Project: Beam
>          Issue Type: Improvement
>          Components: extensions-java-gcp
>            Reporter: Dylan Hercher
>            Priority: P2
>              Labels: Clarified, bigquery, google-cloud-bigquery
>   Original Estimate: 336h
>          Time Spent: 2h
>  Remaining Estimate: 334h
> Allowing a generic failsafe value (of any type) would let a dead letter queue
retain the original source data rather than the cleaned version, so that failed records
could be more easily understood and re-processed.
> BigQueryIO.Write currently supports `withFormatFunction`, which applies a serializable
function to each input element (InputT -> TableRow). Ideally, that same source value
could be converted with a separate function:
> `withFailsafeFormatFunction`, taking (InputT -> TableRow) or possibly (InputT ->
OutputT), though backwards compatibility is harder to preserve with OutputT.
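
The idea in the description above can be sketched in plain Java, without any Beam dependency. This is only an illustration under stated assumptions, not the code from the pull request: the class `FailsafeFormatSketch`, the `write` helper, the `insertSucceeds` predicate, and the use of `Map<String, Object>` as a stand-in for `TableRow` are all hypothetical. It shows the intended routing: rows that are accepted use the normal format function, while rows that fail are re-derived from the original input with a separate failsafe function, so the dead letter output keeps the raw source value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

public class FailsafeFormatSketch {

  /** Rows accepted by the (simulated) sink and rows routed to the dead letter queue. */
  static final class WriteResult {
    final List<Map<String, Object>> written = new ArrayList<>();
    final List<Map<String, Object>> deadLetters = new ArrayList<>();
  }

  /**
   * Hypothetical write loop: format each input, attempt the insert, and on failure
   * apply the failsafe function to the ORIGINAL input rather than the cleaned row.
   */
  static <T> WriteResult write(
      List<T> inputs,
      Function<T, Map<String, Object>> formatFunction,
      Function<T, Map<String, Object>> failsafeFormatFunction,
      Predicate<Map<String, Object>> insertSucceeds) {
    WriteResult result = new WriteResult();
    for (T input : inputs) {
      Map<String, Object> row = formatFunction.apply(input);
      if (insertSucceeds.test(row)) {
        result.written.add(row);
      } else {
        result.deadLetters.add(failsafeFormatFunction.apply(input));
      }
    }
    return result;
  }

  /** Example "cleaned" format: keeps only a parsed integer, or null if unparsable. */
  static Map<String, Object> cleanedRow(String raw) {
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("value", raw.matches("-?\\d+") ? Long.valueOf(raw) : null);
    return row;
  }

  /** Example failsafe format: preserves the raw source string unchanged. */
  static Map<String, Object> rawRow(String raw) {
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("raw", raw);
    return row;
  }

  public static void main(String[] args) {
    // The simulated sink rejects rows whose "value" column is null.
    WriteResult result = write(
        List.of("42", "not-a-number"),
        FailsafeFormatSketch::cleanedRow,
        FailsafeFormatSketch::rawRow,
        row -> row.get("value") != null);
    System.out.println("written=" + result.written);
    System.out.println("deadLetters=" + result.deadLetters);
  }
}
```

In this sketch both functions return the same row type, which mirrors the (InputT -> TableRow) variant from the description; the (InputT -> OutputT) variant would need an extra type parameter on the result.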

This message was sent by Atlassian Jira
