beam-commits mailing list archives

From "Mark Liu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (BEAM-5108) Improve Python test framework to prevent streaming pipeline leaks
Date Wed, 08 Aug 2018 16:35:00 GMT

     [ https://issues.apache.org/jira/browse/BEAM-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Liu updated BEAM-5108:
---------------------------
    Summary: Improve Python test framework to prevent streaming pipeline leaks  (was: Python
test framework should prevent streaming pipeline leaks)

> Improve Python test framework to prevent streaming pipeline leaks
> -----------------------------------------------------------------
>
>                 Key: BEAM-5108
>                 URL: https://issues.apache.org/jira/browse/BEAM-5108
>             Project: Beam
>          Issue Type: Task
>          Components: testing
>            Reporter: Mark Liu
>            Priority: Major
>
> Recently, a few Python streaming pipelines in the Dataflow apache-beam-testing project have run
for more than 5 days. This looks like a leak from the Jenkins job that runs e2e integration
tests.
> The test framework has a pipeline resource cleanup step that applies to all integration tests;
it is defined in [TestDataflowRunner|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py#L67].
However, the cancellation may fail in special cases, such as the following (from [this Jenkins
run|https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Python_Verify/5636/consoleFull]):
> {quote}
> Workflow modification failed. Causes: (c53cc746f7bc7f49): Operation cancel not allowed
for job 2018-08-01_13_10_24-5019826606522054507. Job is not yet ready for canceling. Please
retry in a few minutes.
> {quote}
> Two possible approaches to improve the test infra:
> 1. Add retries to the framework's cancellation.
> 2. Instead of waiting only until the pipeline is in the RUNNING state ([here|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py#L57]),
wait longer to make sure the worker pool starts successfully.
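
The first approach above (retrying the cancellation) could look roughly like the sketch below. It is an illustration only, not the code in TestDataflowRunner: `cancel_with_retry`, `CancelNotReadyError`, and the default attempt count and delay are hypothetical names and values. The exception type the Dataflow client actually raises for "Job is not yet ready for canceling" is not stated in the issue, so a stand-in exception is used.

```python
import time


class CancelNotReadyError(Exception):
    """Hypothetical stand-in for the 'Job is not yet ready for canceling' failure."""


def cancel_with_retry(cancel, max_attempts=5, delay_secs=60.0, sleep=time.sleep):
    """Call cancel() until it succeeds or max_attempts is exhausted.

    Returns the number of attempts used. If every attempt fails, the last
    CancelNotReadyError is re-raised so the test run reports the leak.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            cancel()
            return attempt
        except CancelNotReadyError:
            if attempt == max_attempts:
                raise
            # The service asked to retry later, so wait before the next attempt.
            sleep(delay_secs)
```

In the test framework, `cancel` would presumably be a callable that cancels the running pipeline and translates the "not yet ready" response into the retryable error; that wiring is an assumption and is not shown here. The `sleep` parameter is injectable so the retry logic can be unit-tested without real waiting.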



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
