flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Fabian Hueske <fhue...@gmail.com>
Subject Re: Bushy plan execution
Date Thu, 18 May 2017 10:06:09 GMT
Hi,

Flink does not apply join order optimization (neither in the DataSet nor in
the Table API). Joins are executed in the same order as they are specified.
You can build bushy join plans for SQL by nesting queries:

SELECT *
 FROM (SELECT * FROM X, Y WHERE x = y) AS t1, (SELECT * FROM U, V WHERE u =
v) AS t2)
 WHERE t1.a = t2.b

You can check the execution plan returned by
ExecutionEnvironment.getExecutionPlan() and paste it into the web
visualizer [1].
The TableEnvironment has its own explain command
BatchTableEnvironment.explain(Table) or
StreamTableEnvironment.explain(Table).

Let me know if you have more questions,
Fabian

2017-05-17 10:54 GMT+02:00 Mathias Eriksen Otkjær <meot12@student.aau.dk>:

> Hi
>
> We have attempted to create a bushy logical plan for Flink, but we are not
> certain whether it actually gets executed in parallel or in a linear
> fashion inside Flink (we are certain that it works, as we get the same
> results now as we did using SQL in the table API). How can we confirm that
> such a plan does indeed get executed in the parallel fashion that we expect
> and on different taskmanagers? Should we just trust the framework to
> execute our query plan with operations running in parallel, or should we
> somehow manually set the parallelism of the individual join operations
> inside flink?
>
> Thanks,
>
> Jesper and Mathias
>

Mime
View raw message