Subject: Re: Task Parallelism in a Cluster
From: Ufuk Celebi
Date: Mon, 30 Nov 2015 19:18:04 +0100
To: dev@flink.apache.org

> On 30 Nov 2015, at 17:47, Kashmar, Ali wrote:
> Do the parallel instances of each task get distributed across the cluster or is it possible that they all run on the same node?
Yes, slots are requested from all nodes of the cluster. But keep in mind that multiple tasks (forming a local pipeline) can be scheduled to the same slot (one slot can hold many tasks). Have you seen this? https://ci.apache.org/projects/flink/flink-docs-release-0.10/internals/job_scheduling.html

> If they can all run on the same node, what happens when that node crashes? Does the job manager recreate them using the remaining open slots?

What happens: the job manager tries to restart the program with the same parallelism. So if you have enough free slots available in your cluster, this works smoothly (yes, the remaining/available slots are used).

With a YARN cluster the task manager containers are restarted automatically. In standalone mode, you have to take care of this yourself.

Does this help?

– Ufuk
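
To make the two knobs above concrete, here is a minimal sketch (not from the original thread; it assumes the Flink 0.10-era DataStream Java API, and the parallelism and retry values are illustrative). It shows how a job sets the parallelism that determines how many slots it requests, and how many restart attempts the job manager makes after a failure:

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ParallelismExample {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

            // Each parallel subtask needs a slot. Slots are requested from all
            // registered task managers, so subtasks spread across the cluster
            // whenever more than one node has free slots. (Illustrative value.)
            env.setParallelism(4);

            // After a failure the job manager restarts the job with the same
            // parallelism, up to this many attempts; this only succeeds if
            // enough free slots remain. (Illustrative value.)
            env.setNumberOfExecutionRetries(3);

            env.fromElements("a", "b", "c")
                .map(new MapFunction<String, String>() {
                    @Override
                    public String map(String value) {
                        return value.toUpperCase();
                    }
                })
                .print();

            env.execute("parallelism example");
        }
    }

The number of slots each task manager offers is configured with taskmanager.numberOfTaskSlots in flink-conf.yaml. With slot sharing, one slot can then run one parallel subtask of each operator in a local pipeline, which is why many tasks can end up in the same slot.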