flink-user mailing list archives

From Till Rohrmann <trohrm...@apache.org>
Subject Re: Need for user class path accessibility on all nodes
Date Tue, 18 Jun 2019 08:41:10 GMT
Hi Abdul,

As Biao said, the `--classpath` option should only be used if you want to
make dependencies available that are not included in the submitted user
code jar, e.g. if you have installed a large library that is too costly to
ship every time you submit a job. Usually, you do not need to specify
this option if you build an uber jar.
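
For illustration, here is a minimal sketch of how such a submission command might be assembled. It assumes the `flink run` CLI with its `--classpath <url>` option, which may be repeated and expects each URL to name a protocol such as file://. The helper name `build_flink_run_command` and all paths are hypothetical and only serve the example.

```python
from urllib.parse import urlparse


def build_flink_run_command(job_jar, classpath_urls=()):
    """Assemble an argv list for `flink run` with optional --classpath URLs.

    Hypothetical helper for illustration; it only builds the command and
    does not check that the URLs are reachable from the cluster nodes.
    """
    command = ["flink", "run"]
    for url in classpath_urls:
        # Each entry must name a protocol (e.g. file://); bare paths are rejected.
        if not urlparse(url).scheme:
            raise ValueError(f"classpath entry needs a protocol: {url!r}")
        command += ["--classpath", url]
    # Options are placed before the job jar.
    command.append(job_jar)
    return command


if __name__ == "__main__":
    # Hypothetical paths: a library pre-installed at the same location on
    # every node, and a job jar that does not bundle it.
    print(" ".join(build_flink_run_command(
        "my-job.jar", ["file:///opt/shared-libs/big-lib.jar"])))
```

With an uber jar the URL list is simply left empty, and the command reduces to `flink run my-job.jar`.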

Cheers,
Till

On Tue, Jun 18, 2019 at 7:23 AM Biao Liu <mmyy1110@gmail.com> wrote:

> Ah, sorry for the misunderstanding.
> So what you are asking is why we need "--classpath"? I'm not sure
> what the original author had in mind. I guess the points listed below might
> have been considered.
> 1. Avoid duplicated deployment. If some common jars are deployed in advance
> to each node of the cluster, the jobs that depend on these jars can avoid
> deploying them one by one.
> 2. Support NFS, which is mentioned in the option description of "--classpath".
>
>
> Abdul Qadeer <quadeer.leo@gmail.com> wrote on Tue, Jun 18, 2019 at 11:45 AM:
>
>> Hi Biao,
>>
>> I am aware of it - that's not my question.
>>
>> On Mon, Jun 17, 2019 at 7:42 PM Biao Liu <mmyy1110@gmail.com> wrote:
>>
>>> Hi Abdul, "--classpath <url>" can be used for classes that are not included
>>> in the user jar. If all your classes are included in the jar passed to Flink,
>>> you don't need "--classpath".
>>>
>>> Abdul Qadeer <quadeer.leo@gmail.com> wrote on Tue, Jun 18, 2019 at 3:08 AM:
>>>
>>>> Hi!
>>>>
>>>> I was going through the submission of a Flink program through the CLI. I see
>>>> that "--classpath <url>" needs to be accessible from all nodes in the
>>>> cluster, as per the documentation. As I understand it, the jar files are already
>>>> part of the blob uploaded to the JobManager from the CLI. The TaskManagers can
>>>> download this blob when they receive the task and access the classes from
>>>> there. Why is there a need to be able to access these files from every node,
>>>> then? It would make sense to use a distributed file system to access these jars
>>>> if the blob files cannot be downloaded over the network, or if the blob doesn't
>>>> contain metadata to differentiate between child class loader classes and
>>>> the rest. However, it seems like the TaskManager always tries to access the
>>>> specified class paths irrespective of network partitions.
>>>>
>>>>
