beam-commits mailing list archives

From "Davor Bonaci (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-190) Dead-letter drop for bad BigQuery records
Date Wed, 13 Apr 2016 22:14:25 GMT

    [ https://issues.apache.org/jira/browse/BEAM-190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15240120#comment-15240120 ]

Davor Bonaci commented on BEAM-190:
-----------------------------------

Quarantining failing elements has been proposed several times.

It is somewhat orthogonal to losing data -- we can re-process quarantined elements at a later
point (e.g., retry periodically, retry after an update, etc.). In the general case, however, the
pipeline is unlikely to make much progress with quarantined elements unless we figure out
something "smart" to unblock progress. It is unclear what the value would be without that.

I would probably try to treat this as a backend-specific feature, as opposed to part of Beam.
Thoughts?

> Dead-letter drop for bad BigQuery records
> -----------------------------------------
>
>                 Key: BEAM-190
>                 URL: https://issues.apache.org/jira/browse/BEAM-190
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-core
>            Reporter: Mark Shields
>            Assignee: Frances Perry
>
> If a BigQuery insert fails for data-specific rather than structural reasons (e.g., cannot
> parse a date), then the bundle will be retried indefinitely, first by BigQueryTableInserter.insertAll
> and then by the overall production retry logic of the underlying runner.
> Better would be to allow the customer to specify a dead-letter store for records such as
> these, so that overall processing can continue while bad records are quarantined.
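To make the description concrete, a hedged sketch of what the requested dead-letter store
could look like at the user level, continuing the snippet from the comment above. The table
name and bucket path are placeholders, and a bounded (batch) pipeline is assumed, since
TextIO.write() needs windowed writes for unbounded input:

    import org.apache.beam.sdk.io.TextIO;
    import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;

    // Structurally valid rows proceed to BigQuery; overall processing is not blocked.
    goodRows.apply("WriteToBigQuery",
        BigQueryIO.writeTableRows()
            .to("my-project:my_dataset.my_table")  // placeholder destination
            .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));

    // Bad records land in a dead-letter location for later inspection or replay.
    quarantined.apply("WriteDeadLetters",
        TextIO.write().to("gs://my-bucket/dead-letters/records"));

For streaming inserts, newer Beam releases expose failed rows directly via
WriteResult.getFailedInserts(), which is close to what this issue asks for.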



