flink-dev mailing list archives

From Till Rohrmann <till.rohrm...@gmail.com>
Subject Re: Flink's Checking and uploading JAR files Issue
Date Thu, 24 Sep 2015 13:58:31 GMT
Hi Hanan,

you're right that currently every time you submit a job to the Flink
cluster, all user code jars are uploaded and overwrite possibly existing
files. This is not really necessary if they don't change. Maybe we should
add a check that already existing files on the JobManager are not uploaded
again by the JobClient. This should improve the performance for your use
case.
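
As a rough illustration of such a check (a minimal sketch, not Flink's
actual code): the client could derive a key from each jar's content and
ask whether the JobManager already stores a file under that key before
sending any bytes. The class JarUploadCache, its methods, and the
in-memory map standing in for the JobManager's storage are all
hypothetical names used only for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the JobManager-side jar storage.
// Jars are keyed by a hash of their content, so an unchanged jar
// maps to the same key on every submission.
public class JarUploadCache {
    private final Map<String, byte[]> stored = new HashMap<>();
    private int uploads = 0;

    // Compute a hex-encoded SHA-256 digest of the jar content.
    static String contentKey(byte[] jarBytes) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(jarBytes)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Store the jar only if no file with the same content key exists.
    // Returns true if an upload happened, false if it was skipped.
    public boolean uploadIfAbsent(byte[] jarBytes) {
        String key = contentKey(jarBytes);
        if (stored.containsKey(key)) {
            return false;
        }
        stored.put(key, jarBytes.clone());
        uploads++;
        return true;
    }

    public int uploadCount() {
        return uploads;
    }

    public static void main(String[] args) {
        JarUploadCache cache = new JarUploadCache();
        byte[] jar = "user-code-v1".getBytes(StandardCharsets.UTF_8);
        byte[] changed = "user-code-v2".getBytes(StandardCharsets.UTF_8);

        System.out.println("first submit uploaded:  " + cache.uploadIfAbsent(jar));
        System.out.println("second submit uploaded: " + cache.uploadIfAbsent(jar));
        System.out.println("changed jar uploaded:   " + cache.uploadIfAbsent(changed));
        System.out.println("total uploads: " + cache.uploadCount());
    }
}
```

Keying by content rather than by file name means a rebuilt jar with
the same name is still uploaded, while an identical jar is skipped no
matter how often the job is resubmitted.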

The corresponding JIRA issue is
https://issues.apache.org/jira/browse/FLINK-2760.

Cheers,
Till

On Thu, Sep 24, 2015 at 1:31 PM, Hanan Meyer <hanan@scalabill.it> wrote:

> Hello All
>
> I use Flink to filter data from HDFS and write it back as CSV.
>
> I keep getting the "Checking and uploading JAR files" message on every
> DataSet filtering action or
> ExecutionEnvironment execution.
>
> I use ExecutionEnvironment.createRemoteEnvironment(ip+jars..) because I
> launch Flink from
> a J2EE Application Server.
>
> Serializing and transferring the JARs takes a huge part of the
> execution time.
> Is there a way to force Flink to transfer the JARs only once?
>
> Please advise
>
> Thanks,
>
> Hanan Meyer
>
