beam-commits mailing list archives

From "Guillaume Balaine (JIRA)" <>
Subject [jira] [Commented] (BEAM-2831) Pipeline crashes due to Beam encoder breaking Flink memory management
Date Wed, 07 Mar 2018 16:44:00 GMT


Guillaume Balaine commented on BEAM-2831:

The implication here is that from 2.1 onwards it is impossible to run any reasonably sized
batch on the FlinkRunner using binary formats like Avro and Protobuf with the default block
size of FileIO...

> Pipeline crashes due to Beam encoder breaking Flink memory management
> ---------------------------------------------------------------------
>                 Key: BEAM-2831
>                 URL:
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>    Affects Versions: 2.0.0, 2.1.0
>         Environment: Flink 1.2.1 and 1.3.0, Java HotSpot and OpenJDK 8, macOS 10.12.6 and unknown Linux
>            Reporter: Reinier Kip
>            Assignee: Aljoscha Krettek
>            Priority: Major
> I’ve been running a Beam pipeline on Flink. Depending on the dataset size and the heap
memory configuration of the jobmanager and taskmanager, I may run into an EOFException, which
causes the job to fail.
> As discussed on Flink's mailing list
(stacktrace enclosed), Flink catches these EOFExceptions and activates disk spillover. Because
Beam wraps these exceptions, this mechanism fails, the exception travels up the stack, and
the job aborts.
> Hopefully this is enough information and this is something that can be adjusted for in
Beam. I'd be glad to provide more information where needed.
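
To illustrate the failure mode described in the report, here is a minimal, self-contained Java sketch. It does not use Flink or Beam code: every class and method name in it (SpillSketch, Writer, writeOrSpill, fullBuffer, wrapping, unwrapping) is hypothetical, and a plain IOException stands in for whatever wrapper exception the encoder throws. It only shows why a catch block for EOFException no longer fires once the exception is wrapped, and how rethrowing the original cause would, under these assumptions, restore the spill path.

```java
import java.io.EOFException;
import java.io.IOException;

public class SpillSketch {
    /** Hypothetical stand-in for a serializer writing into a fixed-size memory segment. */
    interface Writer {
        void write() throws IOException;
    }

    /** Hypothetical stand-in for the caller that treats EOFException as "buffer full, spill to disk". */
    static String writeOrSpill(Writer w) {
        try {
            w.write();
            return "in-memory";
        } catch (EOFException e) {
            // Only a bare EOFException reaches this branch.
            return "spilled";
        } catch (IOException e) {
            // Anything else is treated as a real failure and would abort the job.
            return "failed: " + e.getClass().getSimpleName();
        }
    }

    /** The memory segment is full: the signal the caller is waiting for. */
    static void fullBuffer() throws EOFException {
        throw new EOFException("buffer full");
    }

    /** An encoder that wraps the EOFException, hiding it from the caller's catch block. */
    static void wrapping() throws IOException {
        try {
            fullBuffer();
        } catch (EOFException e) {
            throw new IOException("encode failed", e);
        }
    }

    /** One possible adjustment (an assumption, not the actual fix): rethrow the original cause. */
    static void unwrapping() throws IOException {
        try {
            wrapping();
        } catch (IOException e) {
            if (e.getCause() instanceof EOFException) {
                throw (EOFException) e.getCause();
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        System.out.println(writeOrSpill(SpillSketch::fullBuffer)); // spilled
        System.out.println(writeOrSpill(SpillSketch::wrapping));   // failed: IOException
        System.out.println(writeOrSpill(SpillSketch::unwrapping)); // spilled
    }
}
```

In this sketch the second call is the case from the report: the signal is still present as the cause, but the caller only checks the exception type, so the write is treated as a hard failure instead of triggering spillover.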

This message was sent by Atlassian JIRA
