beam-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-2302) WriteFiles with runner-determined sharding and large numbers of windows causes OOM errors
Date Tue, 16 May 2017 04:44:04 GMT

    [ https://issues.apache.org/jira/browse/BEAM-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011733#comment-16011733 ]

ASF GitHub Bot commented on BEAM-2302:
--------------------------------------

GitHub user reuvenlax opened a pull request:

    https://github.com/apache/beam/pull/3161

    [BEAM-2302] Add spilling code to WriteFiles.

    This is similar to the fix for BEAM-2154.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/reuvenlax/incubator-beam windowed_file_scalability

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/3161.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3161
    
----
commit 675b4fa342e11a437c648f48a890e06b9b42e9bb
Author: Reuven Lax <relax@google.com>
Date:   2017-05-13T19:53:08Z

    Add spilling code to WriteFiles.

----


> WriteFiles with runner-determined sharding and large numbers of windows causes OOM errors
> -----------------------------------------------------------------------------------------
>
>                 Key: BEAM-2302
>                 URL: https://issues.apache.org/jira/browse/BEAM-2302
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>            Reporter: Reuven Lax
>            Assignee: Davor Bonaci
>
> This is because the WriteWindowedBundles transform will create many file writers, and
> the sheer number of file buffers (each defaults to 64 MB per writer) uses up all memory. The
> fix is the same as was done in BigQueryIO: if too many writers are opened, spill into a shuffle,
> and write the files after the shuffle.
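
The spilling idea described in the issue can be illustrated with a small, self-contained sketch. This is not the code from the pull request and does not use the Beam API; the class name `SpillingWriterSketch`, the `maxOpenWriters` cap, and the in-memory lists standing in for file writers and for the shuffle are all assumptions made for illustration. It only shows the control flow: keep a bounded number of writers open, divert elements for any further windows to a spill path, and group the spilled elements by window so they can be written one window at a time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of "spill when too many writers are open"; not Beam's actual code. */
class SpillingWriterSketch {
    // Assumed cap on simultaneously open writers; the message does not state the real limit.
    private final int maxOpenWriters;
    // One in-memory "writer" per window; stands in for a file writer with a large buffer.
    private final Map<String, List<String>> openWriters = new HashMap<>();
    // Elements diverted to the spill path as {window, element} pairs.
    private final List<String[]> spilled = new ArrayList<>();

    SpillingWriterSketch(int maxOpenWriters) {
        this.maxOpenWriters = maxOpenWriters;
    }

    /** Writes directly if a writer exists or can be opened; otherwise spills the element. */
    void process(String window, String element) {
        List<String> writer = openWriters.get(window);
        if (writer == null) {
            if (openWriters.size() >= maxOpenWriters) {
                // Too many writers already open: do not allocate another buffer.
                spilled.add(new String[] {window, element});
                return;
            }
            writer = new ArrayList<>();
            openWriters.put(window, writer);
        }
        writer.add(element);
    }

    int openWriterCount() {
        return openWriters.size();
    }

    int spilledCount() {
        return spilled.size();
    }

    /**
     * Stands in for the shuffle: groups spilled elements by window so that each
     * group can later be written with a single writer open at a time.
     */
    Map<String, List<String>> groupSpilled() {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String[] pair : spilled) {
            grouped.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        return grouped;
    }

    public static void main(String[] args) {
        SpillingWriterSketch sketch = new SpillingWriterSketch(2);
        sketch.process("w1", "a");
        sketch.process("w2", "b");
        sketch.process("w3", "c"); // third window exceeds the cap and is spilled
        System.out.println("open writers: " + sketch.openWriterCount());
        System.out.println("spilled elements: " + sketch.spilledCount());
    }
}
```

In this sketch memory use is bounded by `maxOpenWriters` buffers rather than by the number of windows, which is the property the issue is after; the cost is an extra grouping step for the spilled elements.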



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
