flink-user mailing list archives

From Antonio Martínez Carratalá <amarti...@alto-analytics.com>
Subject Flink remote batch execution in dynamic cluster
Date Fri, 28 Feb 2020 09:25:03 GMT
Hello,

I'm working on a project with Flink 1.8. I submit my jobs from Java to a
remote Flink cluster, as described here:
https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/cluster_execution.html
That part is working, but I want to set up a dynamic Flink cluster to
execute the jobs.

Imagine I have users who sometimes need to run a report, and this report is
generated from data processed in Flink. Whenever a user requests a report, I
have to submit a job to a remote Flink cluster. This job is heavy and may
take 1 hour to finish.

So I don't want to have 3, 4, 5... Task Managers always running in the
cluster: sometimes they are idle, and other times I don't have enough Task
Managers for all the requests. I want to create Task Managers dynamically
as jobs are received at the Job Manager, and get rid of them when the jobs
finish.
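
To make the sizing problem concrete, here is a rough sketch of the
arithmetic involved. It assumes that each job needs as many task slots as
its parallelism and that every Task Manager offers the same number of
slots; the class name, method name and numbers are only illustrative.

```java
public class TaskManagerSizing {

    // Number of Task Managers needed to run all pending jobs at once.
    // Assumption: a job with parallelism p occupies p slots, and every
    // Task Manager offers slotsPerTaskManager slots.
    static int requiredTaskManagers(int[] jobParallelisms, int slotsPerTaskManager) {
        if (slotsPerTaskManager <= 0) {
            throw new IllegalArgumentException("slotsPerTaskManager must be positive");
        }
        int slotsNeeded = 0;
        for (int parallelism : jobParallelisms) {
            slotsNeeded += parallelism;
        }
        // Integer ceiling of slotsNeeded / slotsPerTaskManager.
        return (slotsNeeded + slotsPerTaskManager - 1) / slotsPerTaskManager;
    }

    public static void main(String[] args) {
        int[] pendingJobs = {4, 4, 2};  // hypothetical parallelism of three report jobs
        System.out.println(requiredTaskManagers(pendingJobs, 4));
    }
}
```

With no pending jobs the result is zero, which is the case where I would
like no Task Managers to be running at all.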

I see a lot of options for creating a cluster in
https://ci.apache.org/projects/flink/flink-docs-release-1.8/ under
[Deployment & Operations] [Clusters & Deployment], like Standalone, YARN,
Mesos, Docker, Kubernetes... but I don't know which would be the most
suitable for my use case. I'm not an expert in devops and I barely know
these technologies.

Some advice on which technology to use, and maybe some examples, would be
really appreciated.

Keep in mind that I need to run the job with
ExecutionEnvironment.createRemoteEnvironment(); uploading a jar is not a
valid option for me. It seems to me that not all the options support remote
submission of jobs, but I'm not sure.
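
For reference, a minimal sketch of the kind of remote submission I mean is
below. The host name, port and jar path are placeholders, not my real
setup, and the tiny DataSet program only stands in for the actual report
job. The jar is the one containing the user functions, which the client
ships to the cluster along with the job.

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class RemoteReportJob {

    public static void main(String[] args) throws Exception {
        // Placeholders: replace with the Job Manager host, its port and
        // the jar that contains the user code of the job.
        ExecutionEnvironment env = ExecutionEnvironment.createRemoteEnvironment(
                "flink-master", 8081, "/path/to/report-job.jar");

        // Stand-in for the real report computation.
        DataSet<String> input = env.fromElements("a", "b", "c");
        DataSet<String> result = input.map(new MapFunction<String, String>() {
            @Override
            public String map(String value) {
                return value.toUpperCase();
            }
        });

        // print() triggers execution on the remote cluster and brings the
        // result back to the client.
        result.print();
    }
}
```

Whatever deployment option is suggested needs to keep working with this
style of submission from my Java application.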

Thank you

Antonio Martinez
