Date: Wed, 11 Oct 2017 09:06:00 +0000 (UTC)
From: "Stavros Kontopoulos (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Comment Edited] (SPARK-19700) Design an API for pluggable scheduler implementations

    [ https://issues.apache.org/jira/browse/SPARK-19700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16199989#comment-16199989 ]

Stavros Kontopoulos edited comment on SPARK-19700 at 10/11/17 9:05 AM:
-----------------------------------------------------------------------

Anyone working on this? Any design document or plans?


was (Author: skonto):
Anyone working on this any design document or plans?
> Design an API for pluggable scheduler implementations
> -----------------------------------------------------
>
>                 Key: SPARK-19700
>                 URL: https://issues.apache.org/jira/browse/SPARK-19700
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.0
>            Reporter: Matt Cheah
>
> One point brought up in discussing SPARK-18278 was that schedulers cannot easily be added to Spark without forking the whole project. The main reason is that much of the scheduler's behavior fundamentally depends on the CoarseGrainedSchedulerBackend class, which is not part of Spark's public API and is in fact quite a complex module. As resource management and allocation continue to evolve, Spark will need to integrate with more cluster managers, but maintaining support for every possible allocator in the Spark project would be untenable. Furthermore, it would be impossible for Spark to support proprietary frameworks that specific users develop for their own particular use cases.
> Therefore, this ticket proposes making scheduler implementations fully pluggable. The idea is that Spark will provide a Java/Scala interface to be implemented by a scheduler backed by the cluster manager of interest. The user can compile their scheduler's code into a JAR that is placed on the driver's classpath. Finally, as is the case today, the scheduler implementation is selected and dynamically loaded based on the master URL the user provides.
> Determining the correct API is the most challenging problem. The current CoarseGrainedSchedulerBackend handles many responsibilities, some of which will be common across all cluster managers and some of which will be specific to a particular one. For example, the mechanism for creating executor processes differs between YARN and Mesos, but once those executors are running, the means of submitting tasks to them over the Netty RPC layer is identical across the board.
> We must also consider a plugin model and interface for submitting the application itself, because different cluster managers support different configuration options, and thus the driver must be bootstrapped accordingly. For example, in YARN mode the application and Hadoop configuration must be packaged and shipped to the distributed cache prior to launching the job. A prototype of a Kubernetes implementation starts a Kubernetes pod that runs the driver in cluster mode.
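
For concreteness, here is a minimal sketch of what the pluggable interface described in the second paragraph of the quoted description might look like. It is loosely modeled on the private[spark] ExternalClusterManager trait that already exists inside Spark 2.x; the trait name, the exact method set, and the assumption that TaskScheduler and SchedulerBackend would be promoted to the public API are all illustrative, not a committed design:

    import org.apache.spark.SparkContext
    import org.apache.spark.scheduler.{SchedulerBackend, TaskScheduler}

    // Hypothetical public plugin point; today's ExternalClusterManager is
    // private[spark], so these types would need to be exposed first.
    trait PluggableClusterManager {

      // True if this plugin owns the given master URL, e.g. "k8s://..."
      // or a proprietary scheme such as "mycluster://host:port".
      def canCreate(masterURL: String): Boolean

      // Construct the task scheduler and the cluster-manager-specific
      // backend (the part that actually asks the cluster for executors).
      def createTaskScheduler(sc: SparkContext, masterURL: String): TaskScheduler

      def createSchedulerBackend(
          sc: SparkContext,
          masterURL: String,
          scheduler: TaskScheduler): SchedulerBackend

      // Wire scheduler and backend together once both are constructed.
      def initialize(scheduler: TaskScheduler, backend: SchedulerBackend): Unit
    }

An implementation compiled into a JAR on the driver's classpath could then be discovered via java.util.ServiceLoader (a META-INF/services entry naming the implementing class), which is how SparkContext locates ExternalClusterManager implementations today; canCreate supplies the dynamic selection by master URL that the description calls for.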
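
The submission-side plugin raised in the last paragraph could follow the same discovery pattern. A hedged sketch, where the trait name, method names, and parameters are all assumptions made up for illustration:

    // Hypothetical submission plugin: each cluster manager bootstraps the
    // driver its own way (YARN ships the application and Hadoop
    // configuration to the distributed cache; the Kubernetes prototype
    // starts a driver pod).
    trait ApplicationSubmitter {

      // True if this plugin handles the given master URL.
      def supports(masterURL: String): Boolean

      // Stage resources, apply cluster-manager-specific configuration,
      // and launch the driver in cluster mode; returns a submission id.
      def submit(
          masterURL: String,
          appResource: String,  // primary application JAR or Python file
          appArgs: Seq[String],
          sparkConf: Map[String, String]): String
    }

Keeping submission and scheduling as separate traits would mirror the split the description draws: the common Netty RPC task path stays in Spark core, while only executor provisioning and driver bootstrap vary per cluster manager.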