Subject: Re: Task Parallelism in a Cluster
From: Ufuk Celebi
Date: Mon, 30 Nov 2015 19:18:04 +0100
To: dev@flink.apache.org

> On 30 Nov 2015, at 17:47, Kashmar, Ali wrote:
> Do the parallel instances of each task get distributed across the cluster or is it possible that they all run on the same node?
Yes, slots are requested from all nodes of the cluster. But keep in mind that multiple tasks (forming a local pipeline) can be scheduled to the same slot (one slot can hold many tasks). Have you seen this? https://ci.apache.org/projects/flink/flink-docs-release-0.10/internals/job_scheduling.html

> If they can all run on the same node, what happens when that node crashes? Does the job manager recreate them using the remaining open slots?

What happens: the job manager tries to restart the program with the same parallelism. So if you have enough free slots available in your cluster, this works smoothly (yes, the remaining/available slots are used).

With a YARN cluster the task manager containers are restarted automatically. In standalone mode, you have to take care of this yourself.

Does this help?

– Ufuk
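
To make the two knobs above concrete, here is a minimal sketch (not from the original thread; it assumes the Flink 0.10-era DataStream Java API, and the parallelism and retry values are illustrative). It shows how a job sets the parallelism that determines how many slots it requests, and how many restart attempts the job manager makes after a failure:

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ParallelismExample {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

            // Each parallel subtask needs a slot. Slots are requested from all
            // registered task managers, so subtasks spread across the cluster
            // whenever more than one node has free slots. (Illustrative value.)
            env.setParallelism(4);

            // After a failure the job manager restarts the job with the same
            // parallelism, up to this many attempts; this only succeeds if
            // enough free slots remain. (Illustrative value.)
            env.setNumberOfExecutionRetries(3);

            env.fromElements("a", "b", "c")
                .map(new MapFunction<String, String>() {
                    @Override
                    public String map(String value) {
                        return value.toUpperCase();
                    }
                })
                .print();

            env.execute("parallelism example");
        }
    }

The number of slots each task manager offers is configured with taskmanager.numberOfTaskSlots in flink-conf.yaml. With slot sharing, one slot can then run one parallel subtask of each operator in a local pipeline, which is why many tasks can end up in the same slot.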