beam-commits mailing list archives

From "Pei He (JIRA)" <>
Subject [jira] [Commented] (BEAM-702) Simple pattern for per-bundle and per-DoFn Closeable resources
Date Tue, 04 Oct 2016 18:47:21 GMT


Pei He commented on BEAM-702:

Is the feature request for something like this?
class MyDoFn extends DoFn<String, String> {
  private DBWriter writer = null;

  public void startBundle(Context c) {
    writer = createDBWriter();
  }

  public void processElement(ProcessContext c) {
    // ...
  }
}

I think runners can recognize Closeable resources and their scopes through annotations, and
they can generate the code to execute close() after the resources go out of scope.

But I think the real usefulness is in resources being "closed automatically" in both success
and failure conditions.
Closing the resources manually in finishBundle() or teardown() works only if the pipeline
doesn't fail. If the pipeline fails, the resources might be left open.

This is the problem we have in BigQueryIO.Read, where we start external BQ extract jobs. However,
if the pipeline fails, there is no hook to cancel them.
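
The difference between manual and automatic closing can be shown with a toy model. This is only a sketch: BundleLifecycleDemo, FakeWriter, runManual and runScoped are hypothetical names invented for illustration, and none of this is Beam code. It assumes a bundle is just a list of strings and that processing an empty string fails.

```java
import java.util.List;

/** Toy model of a bundle lifecycle; hypothetical, not Beam code. */
final class BundleLifecycleDemo {
  /** Stand-in for a resource such as a database writer. */
  static final class FakeWriter implements AutoCloseable {
    boolean open = true;

    @Override
    public void close() {
      open = false;
    }
  }

  /** Manual style: close() sits after processing, so an exception skips it. */
  static FakeWriter runManual(List<String> bundle) {
    FakeWriter writer = new FakeWriter(); // plays the role of startBundle
    try {
      for (String element : bundle) {
        process(element); // plays the role of processElement
      }
      writer.close(); // plays the role of finishBundle
    } catch (IllegalStateException e) {
      // The bundle failed; nothing on this path closes the writer.
    }
    return writer;
  }

  /** Scoped style: close() runs on both the success and the failure path. */
  static FakeWriter runScoped(List<String> bundle) {
    FakeWriter writer = new FakeWriter();
    try (FakeWriter scoped = writer) {
      for (String element : bundle) {
        process(element);
      }
    } catch (IllegalStateException e) {
      // The bundle failed, but try-with-resources closed the writer before this block ran.
    }
    return writer;
  }

  /** Assumed failure mode for the demo: empty elements are rejected. */
  static void process(String element) {
    if (element.isEmpty()) {
      throw new IllegalStateException("bad element");
    }
  }
}
```

With a bundle that contains an empty string, runManual returns a writer that is still open, while runScoped returns a closed one. On a bundle with no bad elements both return a closed writer.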

In summary, my 2 cents:
+1 for introducing mechanisms to recognize Closeable resources in the model.
And it needs to work in both success and failure conditions.
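
One way a runner might recognize annotated resources is sketched below with plain reflection. Everything here is an assumption made for illustration: the @BundleScoped annotation, ScopedResourceCloser, and ExampleFn are hypothetical names and do not exist in Beam or in any runner. The sketch only handles fields declared directly on the user's class.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

/** Hypothetical marker for fields whose value should be closed when a bundle ends. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface BundleScoped {}

/** Hypothetical runner-side helper; not part of any real runner. */
final class ScopedResourceCloser {
  /** Closes and clears every @BundleScoped AutoCloseable field declared on fn's class. */
  static void closeBundleResources(Object fn) throws Exception {
    Exception first = null;
    for (Field field : fn.getClass().getDeclaredFields()) {
      if (!field.isAnnotationPresent(BundleScoped.class)) {
        continue;
      }
      field.setAccessible(true);
      Object value = field.get(fn);
      if (value instanceof AutoCloseable) {
        try {
          ((AutoCloseable) value).close();
        } catch (Exception e) {
          // Keep going so one failing close() does not leave the others open.
          if (first == null) {
            first = e;
          } else {
            first.addSuppressed(e);
          }
        }
        field.set(fn, null);
      }
    }
    if (first != null) {
      throw first;
    }
  }

  /** Runs one bundle and releases resources on both the success and the failure path. */
  static void runBundle(Object fn, Runnable bundleBody) throws Exception {
    try {
      bundleBody.run();
    } finally {
      closeBundleResources(fn);
    }
  }
}

/** Example user class: it declares the resource and has no explicit cleanup method. */
final class ExampleFn {
  @BundleScoped private java.io.StringReader reader = new java.io.StringReader("row");

  boolean holdsReader() {
    return reader != null;
  }
}
```

The finally block in runBundle is what covers the failure condition: the user's class never calls close(), yet the reader is released whether the bundle body returns or throws.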

> Simple pattern for per-bundle and per-DoFn Closeable resources
> --------------------------------------------------------------
>                 Key: BEAM-702
>                 URL:
>             Project: Beam
>          Issue Type: Improvement
>            Reporter: Eugene Kirpichov
> Dealing with Closeable resources inside a processElement call is easy: simply use try-with-resources.
> However, bundle- or DoFn-scoped resources, such as long-lived database connections, are
> less convenient to deal with: you have to open them in startBundle and conditionally close
> them in finishBundle (likewise setup/teardown), taking special care, if there are multiple
> resources, to close all of them.
> Perhaps we should provide something like Guava's Closer to DoFns.
> Ideally, the user would need to write only a startBundle() or setup() method, and not
> finishBundle() or teardown(): resources would be closed automatically.
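
For illustration, the kind of helper the description asks for might look roughly like the following. This is a minimal sketch using only the JDK: ResourceScope and register are hypothetical names, loosely modeled on the idea behind Guava's Closer, and are neither Guava's nor Beam's API.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper that collects resources and closes them together. */
final class ResourceScope implements Closeable {
  private final Deque<Closeable> resources = new ArrayDeque<>();

  /** Registers a resource and returns it, so it can be assigned in one expression. */
  <T extends Closeable> T register(T resource) {
    resources.push(resource); // pushed to the front: last registered is closed first
    return resource;
  }

  /**
   * Closes every registered resource in reverse registration order. An IOException
   * from one close() does not stop the rest; the first is rethrown with later ones
   * attached as suppressed exceptions.
   */
  @Override
  public void close() throws IOException {
    IOException first = null;
    while (!resources.isEmpty()) {
      try {
        resources.pop().close();
      } catch (IOException e) {
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    if (first != null) {
      throw first;
    }
  }
}
```

With such a helper, code in startBundle() or setup() would register each resource as it is opened, and a single close() on the scope would take care of the "close all of them" part.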

