flink-dev mailing list archives

From Bikas Saha <bi...@hortonworks.com>
Subject RE: Flink on Tez
Date Sat, 08 Nov 2014 07:16:14 GMT
Nice! Looking forward to working with the Flink community on supporting this
effort in any way we can help.

Bikas

-----Original Message-----
From: Kostas Tzoumas [mailto:ktzoumas@apache.org]
Sent: Friday, November 07, 2014 10:03 AM
To: dev@flink.incubator.apache.org; dev@tez.apache.org
Subject: Flink on Tez

Hello Flink and Tez,

I would like to point you to a first version of Flink running on Tez. This
is a Flink subproject (to be initially contributed to flink-addons) that
allows you to run unmodified Flink programs on top of Apache Tez.

You can get the code here:
https://github.com/ktzoumas/incubator-flink/tree/tez_support

If you want to give it a spin, some basic instructions are here:
https://github.com/ktzoumas/incubator-flink/tree/tez_support/flink-addons/flink-tez


Be warned that this is still work in progress, so you may encounter bugs,
and this has not yet been optimized for performance.

A few words on how it works and the motivation:

The programs pass as usual through the Flink compiler and use the Flink
runtime operators (map, reduce, join, etc., as well as the Flink facilities
for sorting, hashing, etc.). Instead of generating a Flink distributed
program (called "JobGraph" in Flink), we can now also generate a Tez program
(called "DAG" in Tez).
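
As a rough illustration of that idea only, the sketch below shows one compiled
operator chain being handed to two interchangeable translators. Every name in
it (PlanTranslationSketch, Backend, PlanTranslator, translatorFor) is
hypothetical and made up for this example. It does not use the actual Flink or
Tez APIs, and the returned strings merely stand in for a real JobGraph or DAG.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these types are real Flink or Tez classes.
final class PlanTranslationSketch {

    /** The two execution engines discussed in this message. */
    enum Backend { FLINK, TEZ }

    /** A backend turns the same compiled operator chain into its own job description. */
    interface PlanTranslator {
        String translate(List<String> operators);
    }

    /** Picks a translator; the compiled plan itself does not depend on the backend. */
    static PlanTranslator translatorFor(Backend backend) {
        // "JobGraph" and "DAG" are only labels here, not real job objects.
        final String container = (backend == Backend.TEZ) ? "DAG" : "JobGraph";
        return operators -> container + "[" + String.join(" -> ", operators) + "]";
    }

    /** Convenience wrapper: compile once, translate for the chosen backend. */
    static String translate(List<String> operators, Backend backend) {
        return translatorFor(backend).translate(operators);
    }

    public static void main(String[] args) {
        // Stand-in for the output of the compiler: an ordered operator chain.
        List<String> plan = Arrays.asList("map", "join", "reduce");
        System.out.println(translate(plan, Backend.FLINK));
        System.out.println(translate(plan, Backend.TEZ));
    }
}
```

The point of the sketch is that the operator chain is built once and only the
last translation step differs between the two engines.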

I have been asked why we would want to do that, given that Flink has its own
execution engine. There are two reasons, in my opinion.

First, Tez follows design choices that are geared towards resource
elasticity, whereas the design choices behind Flink's engine are geared more
towards low-latency querying and iterative processing. Therefore, the two
engines can really complement each other. Users can run their Flink programs
on the engine that better fits their use case and setup.

Second, in Flink we have put a lot of effort into separating program assembly
from program execution and architecting the system in layers (APIs, common
API, compiler, data processing runtime, distributed execution engine). The
possibility to swap execution engines is a good showcase of the benefits of
such a layered architecture.

Of course, trying it out and reporting bugs or contributing is very welcome!

Best,
Kostas

