systemml-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Kunft, Andreas" <>
Subject AW: Add Apache Flink as new backend
Date Thu, 03 Mar 2016 17:46:08 GMT

thank you for the fast reply. We are glad you like the idea!

As next step, we will focus on implementing a end-to-end integration based on your suggestions.
We think that this initial integration is a good start for further discussions based on the
concrete implementation in the pull request.



Von: Matthias Boehm <>
Gesendet: Donnerstag, 3. März 2016 06:44
Betreff: Re: Add Apache Flink as new backend

Thanks guys, for sharing the details of this prototype. In general, I really like the idea
of having a Flink backend in SystemML. We just need to structure the code (similar to our
Spark backend) in a way that Flink libraries are not necessarily required when running in
Spark or MapReduce execution modes.

To answer your questions in detail:

1) Shared Functionality: I would recommend to reuse the upper levels (i.e., language, hops,
lops, etc) and core block operations but keep the instructions (and everything that accesses
Flink APIs) independent. Yes, this separation comes at the cost of code duplication but it
allows to run backends without the need for libraries of the other backends. Note that we
did the same for our Spark backend, which allows us to run the same jar in old MapReduce v1
environments where these libraries are not available during runtime. Down the road we might
consolidate common functionality like runtime maintenance of matrix characteristics etc.

2) Execution Modes: Yes, please add two new execution modes in DMLScript.RUNTIME_PLATFORM
and one new execution type in Lop.ExecType. Once this is done, you can already run end-to-end
scripts with '-exec hybrid_flink'. Of course we can have more detailed discussions about how
and when to select Flink operators during operator selection in hybrid_flink mode. As a start,
I would recommend our default heuristic of compiling Flink operators whenever the memory estimate
of an operation exceeds the local memory budget of the driver/client process.

3) Pull Requests: I would recommend multiple stages: (1) initially a minimal end-to-end integration,
(2) multiple packages of "instruction sets" incl tests, (3) specific rewrites / optimizer
extensions, and later (4) continuous improvements. For the initial end-to-end integration,
I would focus on two or three simple yet very important instructions (tsmm, mapmm, mapmmchain),
basic converter utils, and a basic end-to-end integration (execution types, serialization,
buffer pool, etc). Having tsmm and mapmm (plus optionally mapmmchain) would already allow
you to run end-to-end algorithms like LinregDS, LinregCG, GLM, L2SVM, PageRank, etc for common
scenarios where only transpose-self matrix multiplications or matrix-vector multiplications
are compiled to distributed operations while remaining operators are executed in the driver/client
and vectors are small enough to be broadcast to mapmm/mapmmchain.

4) Refactoring runtime package: Again, don't worry about the refactoring of our runtime packages.
The focus is mainly our block runtime, restructuring everything such that this runtime can
be easily packaged as an individual jar and distributed/consumed independently of SystemML.

I'm looking forward to many more discussions on this topic.


[Inactive hide details for "Kunft, Andreas" ---03/02/2016 11:51:16 AM---Hi all, we are a group
of researchers from the Database]"Kunft, Andreas" ---03/02/2016 11:51:16 AM---Hi all, we are
a group of researchers from the Database group (DIMA) at TU Berlin. We would like to

From: "Kunft, Andreas" <>
To: "" <>
Date: 03/02/2016 11:51 AM
Subject: Add Apache Flink as new backend


Hi all,

we are a group of researchers from the Database group (DIMA) at TU Berlin. We would like to
add Apache Flink as an execution backend to SystemML in addition to Hadoop MR and Spark.
To this end we started implementing a proof of concept consisting of several instructions
together with the necessary de-/serialization and execution-logic.
You can see the current state of our fork [1] including two test-cases showing what we currently
support [2][3].

For our simple POC implementation we realized that we had to duplicate a lot of functionality
(especially from spark instructions). We saw that people already raised concerns regarding
the refactoring of the runtime package [4][5], potentially making it easier to integrate further
Given that this would be a bigger change, it would be helpful to get some input from the SystemML
community regarding this effort.

In particular, we would like to discuss the following questions:

How should we deal with shared functionality between the different backends (Flink, Spark,
etc.) to avoid code duplication, especially in instructions, but also introduce modularity?
And is this modularization even desired?
How should we integrate Flink into the different runtime-modes? (Flink-only, Flink-Hybrid,
How should we structure the integration? (multiple/single commits)

We're looking forward to feedback and hope the community likes the idea of adding Flink as
an execution backend to SystemML.

Andreas Kunft
Christoph Brücke
Felix Schüler


  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message