beam-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (BEAM-3202) Multiple deserializations of PipelineOptions leaks memory
Date Thu, 16 Nov 2017 17:42:00 GMT


ASF GitHub Bot commented on BEAM-3202:

GitHub user lukecwik opened a pull request:

    [BEAM-3202] Ensure that PipelineOptions.getOptionsId is always populated.

    Follow this checklist to help us incorporate your contribution quickly and easily:
     - [ ] Make sure there is a JIRA issue filed for the change (usually before you start
       working on it). Trivial changes like typos do not require a JIRA issue. Your pull
       request should address just this issue, without pulling in other changes.
     - [ ] Each commit in the pull request should have a meaningful subject line and body.
     - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`,
       where you replace `BEAM-XXX` with the appropriate JIRA issue.
     - [ ] Write a pull request description that is detailed enough to understand what the
       pull request does, how, and why.
     - [ ] Run `mvn clean verify` to make sure basic checks pass. A more thorough check will
       be performed on your pull request automatically.
     - [ ] If this contribution is large, please file an Apache [Individual Contributor License

You can merge this pull request into a Git repository by running:

    $ git pull beam3202

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4140

commit 66ac83928219ff71c218a880770f8809f5e6c307
Author: Luke Cwik <>
Date:   2017-11-16T17:40:42Z

    [BEAM-3202] Ensure that PipelineOptions.getOptionsId is always populated.


> Multiple deserializations of PipelineOptions leaks memory
> ---------------------------------------------------------
>                 Key: BEAM-3202
>                 URL:
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>            Reporter: Luke Cwik
>            Assignee: Luke Cwik
>             Fix For: 2.3.0
> In particular, upon deserializing a PipelineOptions object,
> ProxyInvocationHandler.Deserializer
> calls ValueProvider.RuntimeValueProvider.setRuntimeOptions(options) which
> stores the (newly) deserialized PipelineOptions instance in a static map
> inside the RuntimeValueProvider class, where the key is an id obtained by
> calling deserializedOptions.getOptionsId().
> The problem is that each serialize-deserialize cycle on a given
> PipelineOptions instance, followed by a call to getOptionsId(), yields a
> different optionsId. Therefore, every deserialization of the same
> PipelineOptions instance adds a new key to the static "optionsMap" map
> inside the ValueProvider.RuntimeValueProvider class, so the map keeps growing.
> The fix is to populate the options id when the PipelineOptions object is
> created. This can be tested by creating a PipelineOptions object,
> serializing and deserializing it, and checking that the copy has the same
> options id as the original.
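
The leak and the fix described above can be illustrated with a small
stand-alone sketch. This is not Beam's code: LazyIdOptions, EagerIdOptions,
OPTIONS_MAP and the helper methods are hypothetical names, and plain
java.io serialization stands in for Beam's own PipelineOptions serializer.
The sketch only assumes what the description states, namely that each
deserialized copy is registered in a static map under getOptionsId().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class OptionsIdLeakSketch {
  // Hypothetical stand-in for the static "optionsMap" in RuntimeValueProvider.
  static final Map<Long, Object> OPTIONS_MAP = new ConcurrentHashMap<>();
  static final AtomicLong NEXT_ID = new AtomicLong();

  /** Hypothetical options whose id is assigned lazily and never serialized. */
  static class LazyIdOptions implements Serializable {
    private transient Long optionsId; // null again after every round trip

    synchronized long getOptionsId() {
      if (optionsId == null) {
        optionsId = NEXT_ID.incrementAndGet();
      }
      return optionsId;
    }
  }

  /** Hypothetical options whose id is populated at creation and serialized. */
  static class EagerIdOptions implements Serializable {
    private final long optionsId = NEXT_ID.incrementAndGet();

    long getOptionsId() {
      return optionsId;
    }
  }

  /** Serializes and deserializes a value in memory. */
  @SuppressWarnings("unchecked")
  static <T extends Serializable> T roundTrip(T value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (T) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Registers {@code times} copies of one lazy-id object; returns map growth. */
  static int lazyGrowth(int times) {
    LazyIdOptions original = new LazyIdOptions();
    int before = OPTIONS_MAP.size();
    for (int i = 0; i < times; i++) {
      LazyIdOptions copy = roundTrip(original);
      OPTIONS_MAP.putIfAbsent(copy.getOptionsId(), copy); // fresh key each time
    }
    return OPTIONS_MAP.size() - before;
  }

  /** Registers {@code times} copies of one eager-id object; returns map growth. */
  static int eagerGrowth(int times) {
    EagerIdOptions original = new EagerIdOptions();
    int before = OPTIONS_MAP.size();
    for (int i = 0; i < times; i++) {
      EagerIdOptions copy = roundTrip(original);
      OPTIONS_MAP.putIfAbsent(copy.getOptionsId(), copy); // same key each time
    }
    return OPTIONS_MAP.size() - before;
  }

  public static void main(String[] args) {
    System.out.println("lazy id, new map entries:  " + lazyGrowth(5));
    System.out.println("eager id, new map entries: " + eagerGrowth(5));
  }
}
```

In this model five deserializations of the lazy-id object add five map
entries, while five deserializations of the eager-id object add one. The
round-trip check suggested in the description corresponds to comparing
roundTrip(o).getOptionsId() with o.getOptionsId().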

This message was sent by Atlassian JIRA
