Date: Fri, 16 Oct 2015 20:16:05 +0000 (UTC)
From: "Marcelo Vanzin (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Created] (SPARK-11157) Allow Spark to be built without assemblies

Marcelo Vanzin created SPARK-11157:
--------------------------------------

             Summary: Allow Spark to be built without assemblies
                 Key: SPARK-11157
                 URL: https://issues.apache.org/jira/browse/SPARK-11157
             Project: Spark
          Issue Type: Umbrella
          Components: Build, Spark Core, YARN
            Reporter: Marcelo Vanzin

For reasoning, discussion of pros and cons, and other more detailed information, please see attached doc.

The idea is to be able to build a Spark distribution that has just a directory full of jars instead of the huge assembly files we currently have.

Getting there requires changes in a bunch of places. I'll try to list the ones I identified in the document, in the order that I think would be needed to not break things:

* Make streaming backends not be assemblies

Since people may depend on the current assembly artifacts in their deployments, we can't really remove them; but we can make them dummy jars and rely on dependency resolution to download all the jars.

PySpark tests would also need some tweaking here.

* Make examples jar not be an assembly

Probably requires tweaks to the {{run-example}} script. The location of the examples jar would have to change (it won't be able to live in the same place as the main Spark jars anymore).

* Update YARN backend to handle a directory full of jars when launching apps

Currently YARN localizes the Spark assembly (depending on the user configuration); it needs to be modified so that it can localize all needed libraries instead of a single jar.

* Modify launcher library to handle the jars directory

This should be trivial.

* Modify {{assembly/pom.xml}} to generate an assembly or a {{libs}} directory depending on which profile is enabled

We should keep the option to build with the assembly on by default, for backwards compatibility, to give people time to prepare.

Filing this bug as an umbrella; please file sub-tasks if you plan to work on a specific part of the issue.
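
As a rough illustration of the launcher and YARN items in the list above, here is a minimal Python sketch of how a tool might resolve either layout: a directory full of jars when one exists, otherwise a single assembly jar. This is not Spark's launcher code. The function names, the {{libs}} default (borrowed from the directory name mentioned above), and the {{lib/spark-assembly*.jar}} pattern are all assumptions made for the example.

```python
import glob
import os


def resolve_spark_jars(spark_home, jars_dir="libs",
                       assembly_glob="lib/spark-assembly*.jar"):
    """Return the jar paths to put on the classpath (hypothetical helper).

    Prefers <spark_home>/<jars_dir>/*.jar; falls back to a single assembly
    jar matching assembly_glob. Both defaults are illustrative assumptions,
    not paths confirmed by the issue.
    """
    directory = os.path.join(spark_home, jars_dir)
    if os.path.isdir(directory):
        # Sort so the resulting classpath order is deterministic.
        jars = sorted(glob.glob(os.path.join(directory, "*.jar")))
        if jars:
            return jars

    # Backwards-compatible path: exactly one assembly jar is expected.
    assemblies = sorted(glob.glob(os.path.join(spark_home, assembly_glob)))
    if len(assemblies) == 1:
        return assemblies

    raise FileNotFoundError(
        "expected jars in %r or exactly one assembly matching %r under %r"
        % (jars_dir, assembly_glob, spark_home)
    )


def classpath(jars):
    """Join jar paths with the platform's classpath separator."""
    return os.pathsep.join(jars)
```

The same list of paths could serve both purposes described above: joined into a classpath for a local launch, or handed one by one to whatever uploads resources for a YARN application. Preferring the directory and falling back to the assembly keeps existing assembly-based deployments working while the new layout is phased in.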